Utrecht Lecture Notes 


INTRODUCTION TO 
SPECIAL RELATIVITY 


B. de Wit 
Institute for Theoretical Physics & Spinoza Institute 


Department of Physics and Astronomy 
Utrecht University 


August 2001


Contents

1 Introduction
2 Towards a theory of special relativity
3 Playing with light signals
4 The longitudinal Doppler shift
5 Time dilation and the twin paradox
6 The Lorentz-FitzGerald contraction
7 Addition rule for parallel velocities
8 The special Lorentz transformation
9 Space and time for accelerated observers
10 Transformation of velocities
11 Energy and momentum
12 Particles with zero mass
13 The concept of mass
14 The four-dimensional world
15 Physics in Minkowski space
16 Invariant mass
17 Transformation of forces
18 A charged particle in a uniform constant electric field
19 A charged particle in a uniform constant magnetic field
20 Charge and current densities
21 Invariance of Maxwell’s equations
22 Electromagnetic forces acting on a point charge
23 Electromagnetic fields generated by a moving point charge
24 Bibliography


1 Introduction 


In physics as well as in daily life, one is often forced to view the same phenomenon from the 
point of view of different observers. When these observers are in some sense comparable, 
for instance when they speak the same language and share the same cultural background, 
use comparable methods for measurements and have the necessary skills so that they can 
communicate and exchange the results of their observations in a meaningful way, there 
should be no major problems. In physics one often thinks in terms of ideal observers. We 
specify whether the observer is at rest or not, what his position is and what kind of signals 
he can observe. When discussing an actual experiment, we have to address the accuracy 
of the measurement and we usually quote the results of a measurement by specifying their 
degree of accuracy. 

However, matters tend to become more subtle when the observers are in less comparable 
situations. Often we are used to this fact and adapt to it more or less automatically. When 
we view the world from a moving train, we know that the world outside is not really moving 
and blame the apparent motion of the outside world on the movement of the train. When 
we hear the frequency shift in the siren of a passing fire truck, we know that the driver 
does not change the frequency just to acknowledge that he is overtaking us. When we drop 
a stone from a driving car, we know that it can seriously hurt an innocent bystander, as 
the stone will inherit its motion from the car. 

When we don’t know these things from experience, then it becomes a different matter. 
When living in a space shuttle, one quickly finds out that earthly experience is not of much 
use and one has to adapt to a different world with different values. Of course, one may try 
to extrapolate from known situations, but whether or not this is very helpful remains to 
be seen. 

In these lectures we shall consider physical phenomena from the viewpoint of different 
unaccelerated observers moving at a constant relative velocity. In principle we know very 
well how to compare the results of the measurements from the two observers. Of course, 
the two observers will never agree on the position and the velocity of an object that they 
both observe, but we can give a precise translation between the sets of measurements 
provided by the two. In classical mechanics this translation is provided by the Galilei 
transformation, which we now discuss. 

Consider an event viewed by two observers who move at constant relative velocity $\vec v$.
An event is an instantaneous phenomenon characterized by the position and the time at which
it occurred. Both observers, call them O and O’, have their own set of “values”: they
have their own clock and have set up their own coordinate system, denoted by S and
S’, respectively, which they can use to indicate when and where an event takes place.
Before comparing the values of the measurements of the two observers one must make
some suitable calibration. The observers should for instance agree that they measure time
in seconds and lengths in meters. Furthermore they should at some point compare the
reading of their clocks and measure their relative position. To facilitate matters we assume
that the two observers have met and used that opportunity to synchronize their clocks.
More specifically, let us assume that the observers synchronized their clocks at $t = t' = 0$.


At that moment they were, just for an instant, at the same position. Both observers have 
set up their own coordinate system such that they are located at its origin. Furthermore 
they took a quick look at the distant stars at t = t’ = 0, and agreed to direct the axes of 
their coordinate frames in the same way. In other words, the angular orientation of the 
coordinate frames S and S’ is the same. 

After all this calibration we expect that measurements of the two observers of a certain 
event are related as follows. When the first observer measures the event at position $\vec r$ and
time $t$, and the second observer at position $\vec r\,'$ and time $t'$, then these coordinates and times
should be related by

$$ \vec r\,' = \vec r - \vec v\, t , \qquad t' = t , \tag{1.1} $$

where $\vec v$ is the constant velocity of the observer O’ as seen by O. Of course, according to
O’, the observer O moves with velocity $-\vec v$ (as is obvious from the inverse of the relation
(1.1), which gives $\vec r = \vec r\,' + \vec v\, t'$).

The above relation (1.1) is called a Galilei transformation. From it we can derive that
the velocity of some object, as measured by the two observers, will differ by their relative
velocity $\vec v$. To see this, assume that the two observers each make two measurements of the
object at times $t_1 = t_1'$ and $t_2 = t_2'$, separated by a time increment $\Delta t = \Delta t' = t_2 - t_1$.
The first observer locates the object at $\vec r_1$ and $\vec r_2$ at times $t_1$ and $t_2$, respectively, while the
second observer measures the positions $\vec r_1\,'$ and $\vec r_2\,'$. According to (1.1) the results of these
measurements are related by

$$ \vec r_1\,' = \vec r_1 - \vec v\, t_1 , \qquad \vec r_2\,' = \vec r_2 - \vec v\, t_2 . \tag{1.2} $$

The displacement during time interval $\Delta t$ as measured by the two observers is thus given
by

$$ \Delta\vec r\,' = \Delta\vec r - \vec v\, \Delta t . \tag{1.3} $$

Therefore the velocity of the object measured by the two observers (defined by
$\vec u\,' = \Delta\vec r\,'/\Delta t'$ and $\vec u = \Delta\vec r/\Delta t$ in the limit of vanishing $\Delta t$) satisfies

$$ \vec u\,'(t') = \vec u(t) - \vec v , \qquad t = t' . \tag{1.4} $$


This formula expresses the simple fact that the velocities can be added: to obtain the 
velocity measured in one frame in terms of that measured in another frame, one simply 
adds or subtracts the relative velocity between the two frames. This is something we take 
for granted in ordinary life. When walking with a velocity of 5 km/h in a train moving 
at a speed of 100 km/h, we know that our velocity with respect to an observer outside is 
equal to 105 or 95 km/h, depending on whether we walk into or opposite to the direction 
of motion of the train. 
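The train example can be checked with a few lines of Python. This is only a minimal sketch of the addition rule (1.4) in one dimension; the function name `galilei_velocity` and the sign convention (positive along the direction of motion of the train) are illustrative assumptions, not notation used in these notes.

```python
# Galilei velocity addition (1.4): u' = u - v, in one dimension.
# Velocities are in km/h; positive means "in the direction of motion of the train".

def galilei_velocity(u, v):
    """Velocity in S' of an object that has velocity u in S,
    when S' moves with velocity v relative to S."""
    return u - v

TRAIN = 100.0  # velocity of the train relative to the ground

# Seen from the train, the ground frame moves with velocity -TRAIN, so going
# from the train frame to the ground frame means using v = -TRAIN.
walker_forward = galilei_velocity(5.0, -TRAIN)    # walking with the train
walker_backward = galilei_velocity(-5.0, -TRAIN)  # walking against the train

print(walker_forward, walker_backward)  # 105.0 95.0
```

Applying the function twice with opposite relative velocities returns the original value, which is the statement that the inverse of a Galilei transformation is again a Galilei transformation.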

In more abstract terms, the orbit of a moving object is described by a function of the 
time t (or rather three functions, specifying the values of each of the three coordinates 
of the object). In S’ the orbit is described by the function $\vec r\,'(t')$ and in S by $\vec r(t)$. The
velocity in the two frames is obtained by differentiating these functions with respect to $t$,
or equivalently to t’. Using (1.1) this leads directly to (1.4). 




Exercise 1.1: Vectors and scalars — Here and henceforth we make use of a vector notation. Vec- 
tors define a direction (here in a three-dimensional space) and a length. They are usually specified 
in terms of the components defined by the projections onto a Cartesian coordinate system con- 
sisting of three mutually orthogonal axes. Hence vectors can be specified as three-dimensional 
arrays. One can take linear combinations of vectors and there is a so-called inner product; for 
two vectors, $\vec p$ and $\vec q$, the inner product is defined by $\vec p\cdot\vec q = p_x q_x + p_y q_y + p_z q_z$. Prove that the
square of the length of a vector is equal to the inner product of the vector with itself, while the 
inner product of two vectors is equal to the product of their lengths times the cosine of the angle 
between them. Argue that the inner product is invariant under uniform rotations of the vectors 
(or, equivalently, under rotations of the coordinate frame). Quantities that are insensitive to 
rotations (or, in other words, that remain invariant under rotations) are often called scalars. 


Following the same reasoning, we can now also determine the acceleration $\vec a(t) =
d\vec u(t)/dt$ as measured by the two observers. Because $\vec v$, the relative velocity of the two
observers, is constant (i.e. time independent), the change of the velocity per unit of time
must be the same for both observers. Indeed, differentiating (1.4) with respect to $t$ gives

$$ \vec a\,'(t') = \vec a(t) , \qquad t = t' . \tag{1.5} $$


However, when the accelerations are the same in both frames, the force applied to the
object must also be the same according to the two observers, in view of Newton’s second law,
$\vec F = m\,\vec a$. Hence we conclude that

$$ \vec F\,'(t') = \vec F(t) , \qquad t = t' . \tag{1.6} $$


Here we implicitly assumed that the observers themselves move uniformly in the sense that 
they are not themselves accelerated. This means that Newton’s first law is valid in both 
S and S’: objects that are not accelerated with respect to either one of the observers 
are not subject to a force. Frames that satisfy this condition are called inertial frames. 
Obviously, two inertial frames can only move at constant relative velocity. Of course, in 
many practical situations frames attached to the earth can be considered as inertial, but 
this is only an approximation as the earth itself rotates. A better approximation is to 
choose a frame directed to certain distant stars. However, these practical difficulties will 
not concern us here. 

The fact that the time is the same for both observers is an essential ingredient in 
deriving these relatively simple relations. Time is an absolute quantity in the sense that 
it can be specified without making reference to a particular frame. The laws of classical 
mechanics are invariant under Galilei transformations. We did not really derive this, as we 
simply assumed that Newton’s second law remained valid for both observers. In this way 
we concluded that forces are the same to both observers as stated in (1.6). By invariant we 
simply mean that the various quantities measured in an inertial frame always satisfy the 
same equations of classical mechanics. This property is often formulated in terms of the 
so-called relativity postulate. According to this postulate, there is no preferential (inertial)
frame in the sense that an observer will not be able to decide on the basis of his own
measurements whether he is at rest or not. Two observers moving at constant relative 
velocity are completely equivalent. They can decide on the basis of their measurements 
that they move relative to some other observer, but not that they are at rest in an absolute 
sense. In other words, there is no absolute reference system. 


Exercise 1.2: Quantities conserved under Galilei transformations — Show that the kinetic energy 
of a particle of mass m transforms under a Galilei transformation (1.4) as 


$$ E \to E' = E - \vec v\cdot\vec p + \tfrac{1}{2}\, m v^2 , $$

where $\vec p = m\vec u$ is the momentum of the particle. Consider two particles with masses $m_1$ and $m_2$
and momenta $\vec p_1$ and $\vec p_2$, respectively, which collide and produce two particles of mass $m_3$ and
$m_4$ and momenta $\vec p_3$ and $\vec p_4$. In a given inertial frame, determine the incoming energy $E_1 + E_2$
and the outgoing energy $E_3 + E_4$. We now assume that energy is conserved in this process,
i.e. that $E_1 + E_2 = E_3 + E_4$. Examine now whether this holds in another inertial frame by
applying a Galilei transformation. Derive that the answer is affirmative provided both the total
momentum and the total mass are conserved in the process. Verify that the latter two conditions
are themselves consistent with the Galilei transformations. As we shall see later momentum and 
energy conservation continue to hold in the theory of special relativity. However, the mass will no 
longer be conserved! In section 13 we explain what the concept of mass means in special relativity. 
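As a numerical sanity check of the transformation rule for the kinetic energy in Exercise 1.2 (a check for one set of values, not a proof), one can compare the energy computed directly in the second frame with the right-hand side of the rule. The mass and the velocities below are arbitrary illustrative choices, and the helper names are hypothetical.

```python
# Check E' = E - v.p + (1/2) m v^2 for one arbitrary choice of m, u and v.

def dot(a, b):
    """Inner product of two three-dimensional vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def kinetic_energy(m, u):
    """Newtonian kinetic energy (1/2) m u^2."""
    return 0.5 * m * dot(u, u)

def galilei_velocity(u, v):
    """Velocity addition rule (1.4): u' = u - v, component by component."""
    return tuple(ui - vi for ui, vi in zip(u, v))

m = 2.0                  # mass of the particle (arbitrary units)
u = (3.0, 4.0, 0.0)      # velocity of the particle in S
v = (1.0, -2.0, 2.0)     # velocity of S' relative to S

p = tuple(m * ui for ui in u)                             # momentum in S
E = kinetic_energy(m, u)                                  # energy in S
E_direct = kinetic_energy(m, galilei_velocity(u, v))      # energy computed in S'
E_formula = E - dot(v, p) + 0.5 * m * dot(v, v)           # transformation rule

print(E, E_direct, E_formula)  # 25.0 44.0 44.0
```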


It is important to point out that there are situations in physics which are quite different 
in this respect, where there does exist an obvious reference system. One such situation 
is the propagation of sound waves. Sound waves propagate through a material medium, 
usually air, and one can for instance compare sound waves emitted by a moving source 
to those emitted by a source at rest. Here the transmitting medium defines a natural 
reference system, and the observed effects depend on the motion of the source and the 
observer relative to the medium. This leads for instance to the so-called Doppler effect. 
In view of what follows later, let us briefly exhibit this effect in two different situations. 
Assume that the waves propagate through a medium with velocity $v_s$, so that the frequency
$\nu$ and the wave length $\lambda$ of a wave emitted by the source at rest are related by $\nu\lambda = v_s$.
When the source is receding with velocity $v$ from an observer who is at rest with respect
to the medium, the observed frequency $\nu'$ and wave length $\lambda'$ can be expressed in terms of
$\nu$ and $\lambda$. The result reads

$$ \nu' = \frac{\nu}{1 + v/v_s} , \qquad \lambda' = \lambda\,(1 + v/v_s) . \tag{1.7} $$

Observe that $\nu'\lambda' = v_s$. On the other hand when the observer is receding with velocity $v$
from a source which is at rest with respect to the medium, then the expressions for the
observed frequency and wave length are rather different and given by

$$ \nu' = \nu\,(1 - v/v_s) , \qquad \lambda' = \lambda . \tag{1.8} $$

In both cases the observed frequency is reduced, albeit by a different factor. Note that
in the second case the product of the observed frequency and wave length is not equal to
$v_s$, but to $v_s - v$, which is the sound velocity as measured by the moving observer.
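The difference between the two situations is easy to see numerically. The following Python sketch evaluates the expressions for a receding source and for a receding observer; the sound velocity, the emitted frequency and the recession velocity are assumed illustrative values, and the function names are hypothetical.

```python
# Doppler effect for sound: receding source versus receding observer.

def receding_source(nu, v, v_s):
    """Source recedes with velocity v; the observer is at rest in the medium."""
    lam = v_s / nu                       # emitted wave length, nu * lam = v_s
    return nu / (1.0 + v / v_s), lam * (1.0 + v / v_s)

def receding_observer(nu, v, v_s):
    """Observer recedes with velocity v; the source is at rest in the medium."""
    lam = v_s / nu
    return nu * (1.0 - v / v_s), lam     # the wave length is unchanged

V_S = 340.0   # assumed sound velocity in air, m/s
NU = 440.0    # assumed emitted frequency, Hz
V = 34.0      # assumed recession velocity, m/s

nu_src, lam_src = receding_source(NU, V, V_S)
nu_obs, lam_obs = receding_observer(NU, V, V_S)

print("receding source:  ", nu_src, "Hz, product", nu_src * lam_src, "m/s")
print("receding observer:", nu_obs, "Hz, product", nu_obs * lam_obs, "m/s")
```

For the receding source the product of frequency and wave length reproduces the sound velocity in the medium, while for the receding observer it gives the reduced sound velocity measured by that observer.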

The relevance of a natural reference system linked to the medium that transmits the 
waves is rather obvious. Light waves are rather different in this respect. Because they 
can travel through vacuum, their transmission does not seem to depend on the presence 
of some material medium. Indeed, light is electromagnetic radiation and it is thus
governed by Maxwell’s laws of electromagnetism, rather than by Newton’s laws of classical
mechanics. The consequence of this fact was hard for physicists in the nineteenth
century to comprehend; they tried in vain to interpret light as a consequence of oscillations in some
unobservable material medium, called aether. However, it has been established by many
experiments that light behaves quite differently. For instance, experiments failed to show
the aether drift caused by the earth when moving through the aether. Historically the most
famous experiments were performed by Michelson and Morley¹, but later many other
experiments were done, for instance, by employing modern accelerators in which elementary
particles are accelerated to velocities comparable to the velocity of light. All experimental
observations confirmed that the velocity of light (in vacuum) is the same in any inertial
reference frame and equal to² 299 792 458 m/s. This result indicates that an absolute
reference system does not exist and that light is in fact not a material phenomenon, i.e. that
it is not a manifestation of the properties of some material medium. 

Accepting this fact, surprising as it may be — after all, it violates the addition rule for 
velocities (1.4) which so perfectly agrees with what we experience for material objects in 
daily life — it seems that the rules based on Galilei transformations do simply not apply to 
light. This fact can be taken as evidence that these transformation rules are in fact not 
relevant for electromagnetic phenomena. The purpose of the subsequent discussion is to 
demonstrate that this is indeed the case. 

Consider two charged bodies with charges $q_1$ and $q_2$ at rest and separated by a distance
$r$. According to Coulomb’s law, the force between these two particles is equal to

$$ F_{\rm el} = -\frac{1}{4\pi\epsilon_0}\,\frac{q_1 q_2}{r^2} , \tag{1.9} $$

where $\epsilon_0$ is a constant that denotes the so-called permittivity of the vacuum (which depends
on our choice of units and is not important for what follows). When the charges are of
equal sign this electrostatic force is repulsive, and for this reason we added the minus sign.


¹ A.A. Michelson and E.W. Morley, Am. J. Sci. 34 (1887) 333.
² Nowadays this number is used to define the meter, so that it is exact!

Now take the same system of two charges, but now viewed by an observer who moves at
a constant velocity $v$ in a direction perpendicular to the line connecting the two charged
bodies (see figure 1). Again we expect that there is an electrostatic force, given by (1.9).
However, according to the second observer, the two charges are not at rest so that they will
induce a magnetic field. According to the Biot-Savart law the moving charge $q_1$ induces a
magnetic field at the location of the second charge $q_2$, equal to

$$ B = \frac{\mu_0}{4\pi}\,\frac{q_1 v}{r^2} . \tag{1.10} $$

Here $\mu_0$ is the so-called permeability of the vacuum, related to the permittivity by $\epsilon_0\mu_0 = c^{-2}$.

Figure 1: Two charged bodies with charges $q_1$ and $q_2$, separated by a distance $r$, viewed
by an observer at rest (a) or in motion (b) with respect to these charges.

Due to the magnetic field $B$ the second charge experiences a force (the so-called Lorentz
force) directed along the electrostatic force and in size equal to

$$ F'_{\rm mag} = q_2 v B = \frac{\mu_0}{4\pi}\,\frac{q_1 q_2 v^2}{r^2} = \frac{v^2}{c^2}\,\frac{q_1 q_2}{4\pi\epsilon_0 r^2} , \tag{1.11} $$


which is attractive for like-sign charges. 

Now we are facing an obvious dilemma. According to the moving observer there is a 
magnetic force, which is absent according to the observer at rest. Hence, the result (1.6) 
that the force should be the same for two inertial observers, does not hold for magnetic 
forces separately and neither does it hold for the combined electromagnetic force, as the 
total force seen by the second observer is equal to 


$$ F'_{\rm total} = F'_{\rm el} + F'_{\rm mag} = \Big(1 - \frac{v^2}{c^2}\Big)\, F_{\rm total} . \tag{1.12} $$


So it appears that there is an error somewhere in this derivation. One possibility is that 
Newton’s laws are after all not invariant when converted from one inertial frame to another. 
An alternative possibility is that the mass or the charge is somehow changed, thus affecting 
the above results. Whatever the reason, it is clear that electric and magnetic forces are 
somehow converted into each other when changing from one inertial frame to another. This 
suggests the possibility that not the combined force should be invariant, but rather the 
modulus of the two forces, defined by $\sqrt{F_{\rm el}^2 + F_{\rm mag}^2}$. Although there is some truth behind
this idea, as we shall see in the later part of these lectures³, it is not clear how to prove such
a result. At any rate the above results are in blatant disagreement with (1.6), so that our
conclusion must be that there is a conflict between the description of classical mechanics
based on Newton’s laws and the description of electromagnetism based on Maxwell’s laws.
However, note that the discrepancy disappears for velocities small compared to the velocity
of light.

³ A correct relativistic treatment gives $F'_{\rm total} = \sqrt{1 - v^2/c^2}\; F_{\rm total}$ for (1.12).
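To get a feeling for how small the discrepancy is at everyday velocities, the following Python sketch tabulates the ratio of the total force seen by the moving observer to that seen by the observer at rest, once using the factor $1 - v^2/c^2$ obtained above and once using the factor $\sqrt{1 - v^2/c^2}$ of the correct relativistic treatment. The chosen velocities are illustrative assumptions and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # velocity of light in vacuum, m/s

def naive_ratio(v):
    """F'_total / F_total from combining Coulomb and Biot-Savart, as in (1.12)."""
    return 1.0 - (v / C) ** 2

def relativistic_ratio(v):
    """F'_total / F_total according to the correct relativistic treatment."""
    return math.sqrt(1.0 - (v / C) ** 2)

# Illustrative velocities: a car, roughly the orbital velocity of the earth,
# and two sizeable fractions of the velocity of light.
for v in (30.0, 3.0e4, 0.1 * C, 0.6 * C):
    print(f"v/c = {v / C:.3e}   (1.12): {naive_ratio(v):.12f}   "
          f"relativistic: {relativistic_ratio(v):.12f}")
```

Both ratios are indistinguishable from 1 for the first two velocities, which is why no conflict with (1.6) is noticed in daily life.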

Whatever the reason for this conflict, it is not very productive to try in an ad hoc 
manner to modify the physical principles involved and subsequently verify whether or not 
the modifications are supported by experiments. It is better to start anew and reconsider 
in a systematic way some of the starting points underlying classical mechanics and elec- 
tromagnetism. The strategy that we will adopt for doing this will be outlined in the next 
section. 


2 Towards a theory of special relativity 


It is not our primary intention to derive the theory of special relativity from one or two 
fundamental principles, establishing the logical necessity for each separate step. Naturally 
we shall adopt a few basic principles to guide our attempts without insisting that the theory 
we are about to derive follows uniquely from the initial starting points. Once the complete 
theory has been set up, the reader will hopefully appreciate its beauty and consistency and 
should be willing to consider its experimental verification. We should perhaps stress again 
that there exists overwhelming experimental evidence that the special theory of relativity 
is indeed correct. It is true that the extrapolation to velocities comparable to the velocity 
of light leads to situations that are often strange and unusual when judged by criteria 
based on daily life. On the other hand, for velocities small compared to that of light, the 
results remain completely in agreement with what one normally experiences. 


Following Einstein’s original treatment⁴ we distinguish the following two basic principles:


• The relativity postulate. This postulate was already discussed in the previous section
in the context of Newton’s theory of classical mechanics. According to this principle
there is no preferred inertial frame, and neither is there an absolute time. The reason
that we will have to give up the idea of absolute time is related to the second postulate
given below. As before, the relativity postulate implies that no inertial observer will
be able to establish whether or not he or she is at rest. This implies that the physical
laws must be invariant under the transformation from one inertial coordinate system
to another (accompanied by an appropriate change, to be discussed in due time,
of the time variable).


• The light postulate. This postulate is based on the experimental observations that
the velocity of light in vacuum is independent of the inertial frame. Light propagates
in vacuum with a fixed velocity $c$, irrespective of the velocity at which the source was
moving when emitting the light.


⁴ In his paper Zur Elektrodynamik bewegter Körper, Ann. der Physik 17 (1905); translated in The
principle of relativity, p. 35, Dover 1952. For a thorough and informative account of Einstein’s contribution
to the theory of relativity, see A. Pais, Subtle is the Lord ...; The science and the life of Albert Einstein,
Oxford Univ. Press, 1982.


It is clear that these postulates must lead to a theory that is different from Newton’s 
theory of classical mechanics, but, as it turns out, the resulting theory remains fully con- 
sistent with Maxwell’s theory of electromagnetism. Therefore we will proceed as follows. 
We shall first consider the relativistic laws of mechanics that follow from these postulates. 
This is done in Part I, consisting of sections 3-13. In this part mathematics is kept to 
a minimum. The central result is the Lorentz transformation, the relativistic analogue 
of (1.1), which relates the values of the coordinates and the time measured in two differ- 
ent inertial frames. Obviously, when the relative velocity of these two frames is small, we 
should recover the Galilei transformation (1.1). The most conspicuous feature of the Lorentz
transformations is that the time is no longer invariant and will change from one inertial
frame to another! This should not come as a surprise! By insisting that the velocity of
light remains the same, the time must be affected when transforming from one frame to
another. We will derive the quantitative relation by setting up a number of thought
experiments, which allow us to deduce the new features. The Lorentz transformations apply also
to the relativistic version of the momentum and energy of a particle, which are derived as 
well. A surprising consequence of these results is the existence of massless particles, which 
must travel at the speed of light. Part I closes with a discussion of the concept of mass in 
a relativistic context, leading up to Einstein’s equivalence principle. 

In Part II we consider various extensions of these results. We introduce the four- 
dimensional Minkowski space. More mathematical sophistication is required here and we 
make use of matrices and some elementary tensor analysis. After discussing various topics, 
such as the behaviour of charged point particles in electric and magnetic fields, we derive 
how electromagnetic fields change from one inertial frame to another. The correspond- 
ing transformations are obtained from the requirement that the Maxwell equations are 
invariant under the Lorentz transformations. In this way we verify that the conflict be- 
tween mechanical and electromagnetic phenomena is indeed resolved in the special theory 
of relativity. We close by discussing the electromagnetic fields induced by a moving point 
charge. 




PART I 




3 Playing with light signals


We will now examine a number of situations in order to derive the implications of the light 
and the relativity postulate. Light signals will play an essential role, because we know that 
they travel at the same velocity c in any inertial frame. 

Consider an observer who, at time $t = 0$, is passed by some vehicle, say a car. In order
to determine the velocity of the car, the observer sends a light signal in the direction of
the car at some later time $t = T$. A mirror mounted on the car reflects the signal back
to the observer, who detects it at time $t = aT$. From the measurement of the quantity $a$
the observer can determine the velocity of the vehicle in the following way. Suppose the
light signal is reflected by the car at time $t = t_r$, when the vehicle has moved a distance
$x_r$ away from the observer. Hence the vehicle travelled a distance $x_r$ in a time interval
$t_r$. Likewise the light signal travelled the same distance $x_r$, but in a shorter time interval
equal to $t_r - T$. Therefore we have the relations

$$ x_r = v\,t_r = c\,(t_r - T) . \tag{3.1} $$

From this result we obtain the time of reflection (according to the observer at rest),

$$ t_r = \frac{T}{1 - v/c} . \tag{3.2} $$

After reflection, the light signal travels the same distance back, but now in opposite
direction, in order to reach the observer. Using the light postulate, according to which the
velocity of light is always the same in every inertial frame, we know that the amount of
time it takes the light signal to travel back from the car to the observer equals the amount
of time it took the signal originally to travel from the observer to the car. Therefore we
conclude that

$$ aT - t_r = t_r - T = \frac{1}{1 - v/c}\,\frac{vT}{c} , \tag{3.3} $$

where the last equality follows from substituting the result (3.2) for $t_r$. Combining (3.2)
and (3.3) we obtain an expression for $a$,

$$ a = \frac{1 + v/c}{1 - v/c} . \tag{3.4} $$


Hence, by measuring the ratio a and using the light postulate, the observer can determine 
the velocity of the passing vehicle. 
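The measurement procedure can be summarized in a short Python sketch. It works with the dimensionless velocity `beta` = v/c, which is a notational assumption made here for convenience; the function names are hypothetical. The last function is obtained by solving (3.4) for v/c, which gives v/c = (a - 1)/(a + 1).

```python
# Light-signal ("radar") measurement of the velocity of the car, in frame S.

def reflection_time(T, beta):
    """Time of reflection t_r = T / (1 - v/c), see (3.2)."""
    return T / (1.0 - beta)

def ratio_from_velocity(beta):
    """Ratio a = (1 + v/c) / (1 - v/c), see (3.4)."""
    return (1.0 + beta) / (1.0 - beta)

def velocity_from_ratio(a):
    """Solve (3.4) for v/c: the observer's way of reading off the velocity."""
    return (a - 1.0) / (a + 1.0)

T = 1.0      # emission time of the light signal (arbitrary units)
beta = 0.5   # the car moves at half the velocity of light

t_r = reflection_time(T, beta)
a = ratio_from_velocity(beta)

print(t_r, a, velocity_from_ratio(a))  # 2.0 3.0 0.5
print(a * T - t_r == t_r - T)          # True: equal travel times out and back, see (3.3)
```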

Clearly, the crucial assumption in the above argument is that the light signal travels 
at the same velocity back and forth, so that its speed is not influenced by the velocity 
of the car. Of course, this is rather different from the situation one encounters in daily 
life for material objects. If we kick a soccer ball towards a moving car, then the ball is 
returned with a different velocity, which is larger or smaller depending on whether the 
car is moving towards us or away from us. When we repeat the above experiment, but 
now with a material projectile rather than a light signal, the projectile may return after
colliding with the car but with a velocity that is always smaller than its initial velocity. If
we denote the latter again by $c$ in order to facilitate comparison with a light signal, then
standard Newtonian mechanics tells us that the velocity after reflection is at most equal
to $-c + 2v$. We refer to exercise 3.1 for a derivation of this result. Because the velocity
after the collision is smaller, the amount of time it takes for the projectile to return to its 
original position is always larger than the amount of time it took to reach the car. In fact 
it is possible that the projectile never returns to its original position and continues to move 
in the same direction as the car, but with a lower velocity. 


Exercise 3.1: Other than light signals — In order to appreciate the difference when having a 
material object, rather than a light signal, colliding with the car, take a projectile of mass $\mu$
and a car of mass $m$. Denote the velocity of the projectile and the car before the collision
by $c$ and $v$, and after the collision by $c'$ and $v'$, respectively. Momentum conservation then
implies $\mu(c - c') = -m(v - v')$, and energy conservation (the collision is elastic) requires that
$\mu(c^2 - c'^2) = -m(v^2 - v'^2)$. Prove that this leads to the following result for the velocity of the
projectile $c'$ after the collision,

$$ c' = 2v - c + \frac{2\mu}{m + \mu}\,(c - v) . $$

Therefore the projectile’s velocity after the collision is at most equal to $2v - c$ rather than equal
to —c as for light signals. The maximal velocity is reached provided the collision with the car 
is elastic (we already assumed this) and the mass of the projectile is much less than that of the car. 
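The result quoted in Exercise 3.1 can be checked numerically, which is of course no substitute for the proof asked for. The Python sketch below evaluates the projectile velocity after the collision, obtains the velocity of the car from momentum conservation, and then tests energy conservation and the heavy-car limit. The masses and velocities are arbitrary illustrative values and the function names are hypothetical.

```python
# One-dimensional elastic collision of a projectile (mass mu, velocity c)
# with a car (mass m, velocity v), in Newtonian mechanics.

def projectile_after(c, v, mu, m):
    """Velocity c' of the projectile after the collision (result of Exercise 3.1)."""
    return 2.0 * v - c + 2.0 * mu / (m + mu) * (c - v)

def car_after(c, v, mu, m):
    """Velocity v' of the car, from momentum conservation mu (c - c') = -m (v - v')."""
    c_after = projectile_after(c, v, mu, m)
    return v + mu * (c - c_after) / m

c, v, mu, m = 10.0, 2.0, 1.0, 3.0   # arbitrary illustrative values

c_after = projectile_after(c, v, mu, m)
v_after = car_after(c, v, mu, m)
print(c_after, v_after)  # -2.0 6.0

# Twice the kinetic energy before and after the collision.
print(mu * c**2 + m * v**2, mu * c_after**2 + m * v_after**2)  # 112.0 112.0

# For a very heavy car the projectile velocity approaches 2v - c.
print(projectile_after(c, v, mu, 1.0e9), 2.0 * v - c)
```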


Let us return to the experiment with the light signal and now view the same sequence 
of events from the point of view of another observer who is located inside the car. To 
distinguish the two points of view we denote the inertial frame in which the original observer 
is at rest and the car is moving by S. The second frame in which the car is at rest will 
then be denoted by S’. The two frames thus move at a constant relative velocity equal to 
v, and at time t = 0 the two observers are at the same place. 

According to the observer located in the car a light signal is approaching him with 
velocity c, while the first observer who sent the light signal is moving away from him with 
velocity v. After reflection by the mirror the light signal has to travel a larger distance 
back to the first observer than the distance it travelled before the reflection, just because 
meanwhile the first observer has moved further away. Invoking the light postulate the 
second observer concludes that the time interval between the emission and the reflection of 
the signal must be shorter than the time interval between the reflection and the subsequent 
detection by the first observer. Hence we must accept the stunning conclusion that two 
time intervals which are equal when measured in the frame S, are not equal when measured 
in the frame S’. Although the two observers have synchronized clocks, their measurements 
will be in complete disagreement! In other words, the light postulate forces us to give 
up the idea that time can be defined in an absolute way, without making reference to a 
particular reference frame. However, if time intervals measured in different frames need 
not be the same, then we expect that distances do not need to be the same either when 
measured in different frames. This is based on the fact that the velocity of light in different
inertial frames is the same, so that time measurements can be used to measure distances
by means of light signals. 

In the course of these lectures we will learn what the quantitative relation is between 
time and distance measurements done in different inertial frames. For the moment we 
will just proceed carefully, and always indicate explicitly how the time measurement is 
done. It is easy to evaluate at which times, according to the second observer, the various 
events take place. The second observer denotes the time and the position at which the 
light signal is emitted by x, and 7", and the time and position at which the light signal is 
again detected by the first observer after the reflection by z!, and a’ T’. Note that we do 
not make reference to the measurements by the first observer, who is at rest in the inertial 
frame S. If the clock of the second observer is synchronized such that at t/ = 0 the two 
observers are at the same place, we derive the following relations 


$$x'_1 = -v\,T', \qquad x'_2 = -a'T'\,v. \qquad (3.5)$$


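The time $t'_r$ at which the light signal is reflected by the mirror is subject to the relations

$$t'_r - T' = -\frac{x'_1}{c}, \qquad a'T' - t'_r = -\frac{x'_2}{c}. \qquad (3.6)$$

Combining (3.5) and (3.6) then leads to

$$t'_r = (1 + v/c)\,T', \quad \text{and} \quad a' = \frac{1 + v/c}{1 - v/c}. \qquad (3.7)$$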
Observe that although the results for the proportionality constants a and a’ are the same
in the two frames, we have obtained the above results entirely within the new frame S’,
without making reference to the previous frame S. To see that the results in the two frames
are also quantitatively different, let us give the expressions for the time intervals $t'_r - T'$
and $a'T' - t'_r$, which follow directly from the above equations,

$$t'_r - T' = \frac{v\,T'}{c}, \qquad a'T' - t'_r = \frac{1 + v/c}{1 - v/c}\,\frac{v\,T'}{c}. \qquad (3.8)$$


These time intervals are obviously not equal, and in no way resemble the corresponding 
relations (3.3) obtained in the frame S. 
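
As an illustrative check of (3.7) and (3.8), the following minimal Python sketch evaluates the two time intervals in S’ for a few sample velocities. The function name `intervals_in_s_prime` and the sample values are assumptions made for this example only; times are expressed in units of T’.

```python
def intervals_in_s_prime(beta, t_emit=1.0):
    """Time intervals in S' (reflecting object at rest), following (3.5)-(3.8).

    beta   : v/c, with 0 <= beta < 1 (illustrative parameter name)
    t_emit : emission time T' on the clock of S'
    Returns (outbound, inbound): the intervals emission -> reflection
    and reflection -> detection, both measured in S'.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    t_reflect = (1.0 + beta) * t_emit        # t'_r from (3.7)
    a_prime = (1.0 + beta) / (1.0 - beta)    # a' from (3.7)
    outbound = t_reflect - t_emit            # equals beta * T'
    inbound = a_prime * t_emit - t_reflect   # equals a' * beta * T'
    return outbound, inbound


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        outbound, inbound = intervals_in_s_prime(beta)
        print(f"beta={beta}: outbound={outbound:.3f} T', inbound={inbound:.3f} T'")
```

For every nonzero velocity the inbound interval exceeds the outbound one by the factor a’, in line with the discussion above.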

It is convenient to depict the same sequence of events as viewed in two different frames 
in a so-called space-time diagram, in which the coordinates are x and ct (in S) or x’ and ct’
(in S’). We choose ct rather than t so that both coordinates have the dimension of length.
In such a diagram light signals move at an angle of ±45° with the vertical axis. Also
from the diagrams (cf. figure 2) it is obvious that the two time intervals are equal when
viewed in S, while they differ when viewed in S’.

Let us now try to determine more quantitatively what the relation is between time 
measurements in two different frames. Again we make use of the same situation as above. 
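A light signal is emitted by someone at rest in an inertial frame S, reflected by some object

[Figure 2, panel (a): rest frame of the observer, vertical axis ct, showing the moving car. Panel (b): rest frame of the car, vertical axis ct’, showing the receding observer.]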

Figure 2: Space-time diagram of the emission, reflection and subsequent detection of a 
light signal viewed in the inertial frame where the signal is emitted and detected at rest 
(a) and in the inertial frame where the reflecting object is at rest (b). 


moving with constant velocity v and again detected at the point of emission. In the second 
inertial frame S’ the reflecting object is at rest. Both at the location of the light source 
and at the reflecting object there is a clock. These clocks are thus at rest in S and S’, 
respectively. They were synchronized and started running at the moment that they were 
at the same place. Suppose now that the light signal is emitted at time t, as measured on 
the clock in S, and arrives at the reflecting object at time t’, but now measured on the 
clock in S’. The reflected signal, emitted at time t’ as measured on the clock in S’, finally 
arrives at the source of emission at time t”, again according to the clock in S. Now there 
must be some function, let us denote it by f, which gives you the time t’ at which the light 
arrives at the reflecting object as a function of the time t at which the light signal was
emitted. Hence we write 

$$t' = f(t), \qquad (3.9)$$
where we recall once more that t is measured on the clock in S while t’ is measured on 
the clock in S’. Obviously, f must be a monotonically rising function (i.e., f’(t) > 0) so 
that the causal sequence of events remains the same in each frame. Also, for t = 0, t and 
t’ coincide, so that we must have f(0) = 0. 

Now consider the reflected light signal. Again there must be a function that determines 
the return time of the light signal t” as a function of the time t’ at which it was reflected.
According to the relativity postulate this must be the same function f, as we cannot make 
a distinction between the two inertial frames S and S’. Therefore we conclude that 


$$t'' = f(t'). \qquad (3.10)$$


On the other hand we found previously that t” is proportional to t with proportionality 
constant a as given in (3.4). This shows that f must satisfy the following equation, 


$$f(f(t)) = a\,t. \qquad (3.11)$$


One can prove⁵ that the solution of this equation is given by $f(t) = \pm\sqrt{a}\,t$. In view of the
fact that f(t) must be monotonically rising, the solution is unique, so that the emission
and the detection time of a light signal, measured by two receding observers moving at a
relative velocity v, are related by

$$t_{\rm detection} = k(\beta)\,t_{\rm emission}, \qquad (3.12)$$

where we have defined

$$k(\beta) = \sqrt{\frac{1+\beta}{1-\beta}}, \quad \text{with} \quad \beta = \frac{v}{c}. \qquad (3.13)$$


We should stress that $t_{\rm emission}$ and $t_{\rm detection}$ do not refer to the same physical event. The
above equation gives the arrival time $t_{\rm detection}$ of a light signal, measured by an observer
who is receding with a constant velocity v from the source of emission, as a function of the
emission time $t_{\rm emission}$. Observe that v must necessarily be smaller than c in this situation
in order for the light signal to catch up with the receding observer. Hence $k(\beta)$ is only
defined for $|\beta| < 1$.
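
The relation between (3.11) and (3.13) can be checked numerically. The sketch below is only an illustration under our own naming (`k_factor` and `round_trip` are hypothetical helper names): it applies f(t) = k(β) t twice and compares the result with a t, where a is the proportionality constant of (3.4).

```python
import math


def k_factor(beta):
    """Factor k(beta) of (3.13); only defined for |beta| < 1."""
    if not -1.0 < beta < 1.0:
        raise ValueError("k(beta) is only defined for |beta| < 1")
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def round_trip(t_emit, beta):
    """Apply f(t) = k(beta) * t twice: out to the receding clock and back."""
    t_reflect = k_factor(beta) * t_emit      # t'  = f(t),  cf. (3.9)
    t_return = k_factor(beta) * t_reflect    # t'' = f(t'), cf. (3.10)
    return t_reflect, t_return


if __name__ == "__main__":
    beta, t_emit = 0.6, 1.0
    a = (1.0 + beta) / (1.0 - beta)          # proportionality constant a
    t_reflect, t_return = round_trip(t_emit, beta)
    # The last two printed numbers should agree, as required by (3.11).
    print(t_reflect, t_return, a * t_emit)
```

Note also that k(β) k(−β) = 1, so that an approaching observer is described by the inverse factor.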

The factor (3.13) incorporates of course the time needed for the light signal to reach 
the receding observer. However, this does not fully account for the answer. To see this 
one may compare the above result to the result one obtains from a conventional nonrela- 
tivistic derivation. In that case we use the relation (3.2), which is valid in the frame S. 
Nonrelativistically the time is the same for any observer, so one would derive 


$$t_{\rm detection} = \frac{t_{\rm emission}}{1-\beta}, \qquad (3.14)$$
which clearly deviates from the previous result (3.12). For small velocities (3.12) and (3.14) 
tend to coincide, as follows from the expansions 


$$\sqrt{\frac{1+\beta}{1-\beta}} = 1 + \beta + \tfrac{1}{2}\beta^2 + \cdots,$$
$$(1-\beta)^{-1} = 1 + \beta + \beta^2 + \cdots. \qquad (3.15)$$


5In order to analyze the solutions of (3.11) we first apply the function f once more to the left- and the
right-hand side of (3.11). This gives $f \circ f \circ f = a\,f$, from which one derives that

$$f(a\,t) = a\,f(t). \qquad (*)$$

For a = 0 we find f(t) = 0, while for a ≠ 0, equation (*) implies that f’(at) = f’(t), so that f’(t) must be
constant, unless |a| = 1 (we restrict ourselves to differentiable functions). For a ≠ 1 equation (*) shows
that f(0) = 0. Taking the derivative of (3.11) with respect to t yields

$$f'(f(t))\,f'(t) = a, \qquad (**)$$

so that for t = 0 and a ≠ 1 equation (**) reads $(f'(0))^2 = a$. Therefore, for negative a there are no
solutions differentiable at t = 0. Unless a = 1 it thus follows that the only solutions differentiable at t = 0
are $f(t) = \pm\sqrt{a}\,t$. For a = 1 the analysis is more involved and there are many more solutions. With the
exception of f(t) = t, none of them satisfies the requirement that f(0) = 0 and f’(t) > 0.


The following two sections deal with the consequences of the result (3.12), and discuss 
its effect on measurements of time intervals in different inertial frames. 
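
To get a feeling for the size of the deviation between (3.12) and (3.14), one may tabulate both factors. The following Python sketch is an illustration only; the function names are our own choice and the sample velocities are arbitrary.

```python
import math


def k_relativistic(beta):
    """Ratio t_detection / t_emission according to (3.12), (3.13)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def ratio_nonrelativistic(beta):
    """Ratio t_detection / t_emission according to (3.14)."""
    return 1.0 / (1.0 - beta)


def difference(beta):
    """Nonrelativistic minus relativistic ratio; starts at order beta**2."""
    return ratio_nonrelativistic(beta) - k_relativistic(beta)


if __name__ == "__main__":
    for beta in (0.001, 0.01, 0.1, 0.5):
        # According to (3.15) the last column should tend to 1/2 for small beta.
        print(beta, k_relativistic(beta), ratio_nonrelativistic(beta),
              difference(beta) / beta**2)
```

The two expressions agree to first order in β, and the difference divided by β² approaches 1/2 for small velocities, as expected from the expansions (3.15).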


Exercise 3.1: Taylor expansions — The expansions (3.15) are examples of so-called Taylor ex- 
pansions. First verify the correctness of the second equation of (3.15) by multiplying both sides 
with $(1-\beta)$. Subsequently consider $\sqrt{1+\beta} = 1 + \tfrac{1}{2}\beta - \tfrac{1}{8}\beta^2 + \tfrac{1}{16}\beta^3 + \cdots$ and verify its correctness
by taking the square on both sides and comparing terms up to $\beta^3$. Verify now the correctness of
the first equation of (3.15).


4 The longitudinal Doppler shift 


There is an obvious application of the relation (3.12) derived in the previous section. 
Because the relationship between t and t’ is linear, periodic phenomena in one inertial
frame remain periodic when measured in another frame that is receding with constant
relative velocity v. More precisely, if we emit a light pulse at every one of the periodically
occurring events, then these pulses are also periodically received by an observer moving
with constant relative velocity. When the time elapsed between two events is equal to $\Delta t$,
then, according to (3.12), the time elapsed between two recorded light pulses in the second
frame is equal to

$$\Delta t' = k(\beta)\,\Delta t. \qquad (4.1)$$


The frequency $\nu'$, which is the reciprocal of the time period as recorded in the receding
frame, is thus related to the frequency $\nu$ at which the signals are emitted according to

$$\frac{\nu'}{\nu} = \frac{1}{k(\beta)} = \sqrt{\frac{1-\beta}{1+\beta}}. \qquad (4.2)$$


An obvious example of a periodic phenomenon is a monochromatic light wave, where 
the successive nodes of the light wave play the role of the periodic events. In that case 
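(4.2) may be regarded as the relativistic version of the longitudinal Doppler shift.⁶ For
velocities small compared to the velocity of light, we find,

$$\nu' = \nu\,\sqrt{\frac{1-\beta}{1+\beta}} = \nu\,\{1 - \beta + \tfrac{1}{2}\beta^2 + \cdots\}. \qquad (4.3)$$

The corresponding formula for the wave length is

$$\lambda' = k(\beta)\,\lambda = \lambda\,\sqrt{\frac{1+\beta}{1-\beta}} = \lambda\,\{1 + \beta + \tfrac{1}{2}\beta^2 + \cdots\}, \qquad (4.4)$$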


6The transverse Doppler effect applies to observations made in a direction perpendicular to the direction 
of motion of the light signal. Nonrelativistically there is no such effect. 




where we assume that the light velocity is constant and equal to c. 

This result may be compared to the nonrelativistic version ($v \ll c$) of the Doppler
effect where waves (e.g. sound waves), propagating through a medium with velocity $v_s$,
are emitted by a moving source. The source is receding with velocity v from an observer
who is at rest with respect to the medium. The observed frequency $\nu'$ and wave length $\lambda'$
can be compared to the frequency $\nu$ and the wave length $\lambda$ emitted by the source at rest.
The result, discussed in section 1, reads (cf. 1.7)

$$\nu' = \frac{\nu}{1 + v/v_s}, \qquad \lambda' = \lambda\,(1 + v/v_s). \qquad (4.5)$$


We now observe that (4.4) and (4.5) tend to agree for small velocity v and $v_s = c$, thus
justifying the term relativistic Doppler shift. Incidentally, when the source is at rest with 
respect to the medium, while the observer is receding with velocity v, then the result for 
the observed frequency and wave length is quite different, as is shown in (1.8). 
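
The comparison between (4.4) and (4.5) can be made concrete with a short numerical sketch. It is an illustration only: the function names are hypothetical, and we set $v_s = c$ and use units in which c = 1 and the emitted wave length equals 1.

```python
import math


def wavelength_relativistic(wavelength, beta):
    """Observed wave length for a receding source according to (4.4)."""
    return wavelength * math.sqrt((1.0 + beta) / (1.0 - beta))


def wavelength_receding_source(wavelength, v, v_s):
    """Nonrelativistic result (4.5): source receding through a medium."""
    return wavelength * (1.0 + v / v_s)


def frequency_relativistic(frequency, beta):
    """Observed frequency for a receding source according to (4.2)."""
    return frequency * math.sqrt((1.0 - beta) / (1.0 + beta))


if __name__ == "__main__":
    for beta in (0.001, 0.01, 0.1):
        rel = wavelength_relativistic(1.0, beta)
        classical = wavelength_receding_source(1.0, beta, 1.0)
        # The difference is of second order in beta.
        print(beta, rel, classical, rel - classical)
```

The product of observed frequency and observed wave length stays equal to the product of the emitted ones, which reflects the assumption that the light velocity is the same constant c in both frames.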
Experimentally, the best way to determine the Doppler shift in order to distinguish 
between the predictions (4.4) and (4.5), is by measuring the wave lengths of spectral lines 
emitted by moving atoms. The relativistic effect was already confirmed around 1940 in
a series of laboratory experiments by Ives and Stilwell⁷. In the experiments molecular
hydrogen ions $H_2^+$ and $H_3^+$ are accelerated in an electric field. After neutralization and
dissociation, these ions give rise to excited hydrogen atoms with a velocity of approximately
$\beta = 5 \times 10^{-3}$. The wave lengths of certain spectral lines are measured very accurately, both
for light emitted in the direction of motion of the atoms ("forward") and in the opposite
direction ("backward"). According to (4.4) the wave length in the forward direction is
equal to

$$\lambda_{\rm forward} = k(-\beta)\,\lambda_0 = \frac{\lambda_0}{k(\beta)} < \lambda_0, \qquad (4.6)$$

where $\lambda_0$ is the wave length of the spectral line as emitted by the atom at rest. The wave
length measured in the backward direction, on the other hand, is given by

$$\lambda_{\rm backward} = k(\beta)\,\lambda_0 > \lambda_0. \qquad (4.7)$$


Therefore the average wave length is equal to 


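$$\frac{\lambda_{\rm forward} + \lambda_{\rm backward}}{2} = \frac{\lambda_0}{2}\left[\sqrt{\frac{1-\beta}{1+\beta}} + \sqrt{\frac{1+\beta}{1-\beta}}\,\right] = \frac{\lambda_0}{\sqrt{1-\beta^2}}. \qquad (4.8)$$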


This expression should be compared to the nonrelativistic result (4.5), according to which 
the average wave length remains equal to $\lambda_0$. However, Ives and Stilwell reported a shift
of the average wave length with a value that was precisely in accord with (4.8). 
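
To indicate the order of magnitude of this shift, the sketch below evaluates the average of the forward and backward wave lengths for the velocity quoted above. The function names and the rest wave length of 486.1 nm are illustrative assumptions of ours and are not taken from the experimental papers.

```python
import math


def k_factor(beta):
    """Factor k(beta) of (3.13)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def average_wavelength(lambda_0, beta):
    """Mean of the forward (4.6) and backward (4.7) wave lengths."""
    forward = lambda_0 / k_factor(beta)     # blue-shifted line
    backward = lambda_0 * k_factor(beta)    # red-shifted line
    return 0.5 * (forward + backward)


if __name__ == "__main__":
    lambda_0 = 486.1    # nm; assumed illustrative rest wave length
    beta = 5e-3         # velocity of the atoms in units of c
    shift = average_wavelength(lambda_0, beta) - lambda_0
    # The relative shift should be close to beta**2 / 2, cf. (4.8).
    print(shift, shift / lambda_0, 0.5 * beta**2)
```

The relative shift is of second order in β, which is why very accurate wave length measurements are needed to see it.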


7H.E. Ives and G.R. Stilwell, J. Opt. Soc. Am. 27 (1937) 389; 28 (1938) 215; 31 (1941) 369.


5 Time dilation and the twin paradox 


So far we have compared times as measured in two different frames that refer to different 
events, namely to the emission and the subsequent reception of a light signal. We would 
like to know, however, what the value is of a time interval between two given events when 
measured in two different inertial frames. From the previous experiment it is easy to derive 
the desired result. Consider the two following events. The first one occurs when the light 
source and the reflecting body are at the same location. We assumed that in that situation 
the observers in the two frames S and S’ synchronized their clocks. The second event is 
the reflection of the light, which according to the first observer takes place at $t = t_r$. Using
(3.2) we can express $t_r$ as a function of the emission time T of the light signal, so that the
time interval between the two events equals

$$t_r = \frac{T}{1-\beta}. \qquad (5.1)$$

According to the second observer reflection takes place at $t' = t'_r$, which according to (3.12)
is equal to

$$t'_r = k(\beta)\,T. \qquad (5.2)$$


So the time interval between the moment that the light source and the reflecting body 
pass each other ($t = t' = 0$) and the moment that the light signal is reflected ($t_r$ and $t'_r$,
respectively) is not equal when measured in different frames. Instead we find

$$t_r = \frac{t'_r}{(1-\beta)\,k(\beta)} = \frac{t'_r}{\sqrt{1-\beta^2}} = \gamma_v\,t'_r, \qquad (5.3)$$

where we introduced the so-called gamma factor

$$\gamma_v = \frac{1}{\sqrt{1 - v^2/c^2}}. \qquad (5.4)$$


Consequently $t_r > t'_r$ for two inertial frames moving at nonzero relative velocity!
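
For completeness, the intermediate algebra behind this result can be spelled out. The following lines are a sketch that uses only (5.1), (5.2) and the definition (3.13); the gamma factor of (5.4) is written here as $\gamma_v$.

```latex
\begin{align*}
\frac{t_r}{t'_r}
  &= \frac{T/(1-\beta)}{k(\beta)\,T}
   = \frac{1}{1-\beta}\,\sqrt{\frac{1-\beta}{1+\beta}} \\
  &= \frac{1}{\sqrt{(1-\beta)(1+\beta)}}
   = \frac{1}{\sqrt{1-\beta^2}}
   = \gamma_v \;\geq\; 1 .
\end{align*}
```

The emission time T drops out of the ratio, so the result does not depend on when the light signal was sent.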

At first sight one might be worried about this result. After all, according to the relativity 
postulate, there is no difference between the two frames. Therefore, if one observer claims 
that he measures a larger time interval than another observer, then this other observer 
will claim that he should also measure a larger time interval, so that there is an obvious 
contradiction! It is important to resolve this puzzle. While it is true that there is no 
essential difference between the two inertial frames, there is, however, a subtle difference 
in the way the events occur in the two frames. Both events occur at the position of the 
reflecting object, which is at rest in S’. Therefore in S’ both events take place at the same 
place corresponding to x’ = 0. However, in contradistinction with the situation observed in 
S’, the two events do not occur at the same position when observed in S. Our conclusion is 
therefore that a measurement of a certain time interval is minimal when the two consecutive 
events that define the interval occur at the same position. This time interval is called the
proper time and is usually denoted by $\tau$. The time interval measured by another observer
moving at constant relative velocity v is always larger than $\tau$ and given by

$$t = \gamma_v\,\tau. \qquad (5.5)$$

This effect is called time dilation.

There are numerous experimental confirmations of the time dilation effect. Consider, 
for instance, unstable elementary particles, such as muons. A muon is a particle which 
resembles the electron, except that it is 200 times as massive and it is very unstable. At 
rest a muon decays with a mean life time of $\tau_\mu = 2.2 \times 10^{-6}$ seconds into an electron (or a
positron) and two so-called neutrinos. Muons are continuously created by cosmic rays that 
enter the earth’s atmosphere, and they can also be made in the laboratory. The surprising 
fact is that muons created in the outer layers of the atmosphere, which move with velocities 
close to the velocity of light, live long enough to be detected at the earth’s surface! And 
similarly, in the laboratory one can prepare beams of very energetic muons as well as other 
unstable particles, which exist for sufficiently long periods of time to enable physicists 
to perform scattering experiments. These facts are all explained by the phenomenon of 
time dilation. While the muon decays almost instantly according to an observer who is 
moving at the same velocity as the particle, it can live during a considerable period of time 
according to an observer moving at large relative velocity. The shortest lifetime is thus 
measured by an observer with respect to whom the particle is at rest. The frame in which 
the particle is at rest is called the rest frame. 
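
A rough numerical illustration may be helpful here. In the sketch below the muon velocity (β = 0.998) and the distance to the earth’s surface (10 km) are assumed values chosen for illustration, not numbers taken from the text; only the mean life time at rest is the value quoted above. The function names are our own.

```python
import math

C = 299_792_458.0     # speed of light in m/s
TAU_MUON = 2.2e-6     # mean life time of the muon at rest, in seconds


def gamma_factor(beta):
    """Gamma factor of (5.4)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def mean_decay_length(beta, dilated=True):
    """Mean distance travelled before decay, with or without time dilation."""
    lifetime = TAU_MUON * (gamma_factor(beta) if dilated else 1.0)
    return beta * C * lifetime


def surviving_fraction(distance, beta, dilated=True):
    """Fraction of muons surviving a given distance (exponential decay law)."""
    return math.exp(-distance / mean_decay_length(beta, dilated))


if __name__ == "__main__":
    beta = 0.998          # assumed muon velocity in units of c
    distance = 10.0e3     # assumed distance to the surface, in metres
    print("gamma factor:", gamma_factor(beta))
    print("decay length without dilation (m):", mean_decay_length(beta, False))
    print("decay length with dilation (m):", mean_decay_length(beta, True))
    print("surviving fraction without dilation:",
          surviving_fraction(distance, beta, False))
    print("surviving fraction with dilation:",
          surviving_fraction(distance, beta, True))
```

Under these assumptions the decay length without time dilation is well below one kilometre, whereas with time dilation it is comparable to the assumed distance, so that a sizeable fraction of the muons reaches the surface.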

There are many other experiments, such as with very accurate atomic clocks located in 
space labs that circle the earth at large velocities. Historically the phenomenon was often 
discussed in the context of the so-called twin paradox. Consider two twin brothers, called 
Caretaker and Adventurer. While Caretaker stays at home, Adventurer leaves home for a 
long journey. He travels a long distance L (according to measurements by Caretaker) with 
velocity v, after which he returns with the same velocity. According to Caretaker the trip 
takes T seconds, so $L = \tfrac{1}{2} v T$. According to Adventurer the trip takes $T'$ seconds, and the
maximal distance between him and his brother was $L'$, so $L' = \tfrac{1}{2} v T'$.

During the trip the two brothers communicate by light signals: after every time interval
$\Delta\tau$ they emit a flash of light, which is recorded by the other one. During the time that
the two are moving apart, they receive each other's light signals with a time period equal
to $k(\beta)\,\Delta\tau$. Later, when they are again approaching each other, the time period between
two consecutive signals is smaller and equal to $k(-\beta)\,\Delta\tau$. Observe that this statement
applies to both brothers. First they both record signals with time intervals that are larger
than $\Delta\tau$, while later on they receive the signals at a faster rate with time intervals that are
smaller than $\Delta\tau$. In this respect there is thus no difference between the observations made
by the two brothers. However, there is a difference when we consider the trip as a whole
and plot the light signals against time for each of the two. Because it takes Adventurer the
same amount of time to move away from Caretaker as it takes him to return, he emits a
total number of $\tfrac{1}{2}T'/\Delta\tau$ signals when moving away and the same number when returning.
Therefore Caretaker records $\tfrac{1}{2}T'/\Delta\tau$ signals with time period $k(\beta)\,\Delta\tau$ and $\tfrac{1}{2}T'/\Delta\tau$ signals


[Figure 3, time lines: light signals recorded by Caretaker along t, with $\tfrac{1}{2}T'/\Delta\tau$ low frequency signals followed by $\tfrac{1}{2}T'/\Delta\tau$ high frequency signals during a total time T; light signals recorded by Adventurer along t’, during two consecutive periods of duration $\tfrac{1}{2}T'$ each.]


Figure 3: The twin paradox: Caretaker receives an equal number (namely $\tfrac{1}{2}T'/\Delta\tau$) of low
and high frequency light signals, while Adventurer does not receive the same number of
low and high frequency signals, but he receives them during equal time periods (equal to
$\tfrac{1}{2}T'$).


with time period $k(-\beta)\,\Delta\tau$. According to Caretaker the trip of his brother thus takes

$$T = \tfrac{1}{2}T'\,k(\beta) + \tfrac{1}{2}T'\,k(-\beta) = \gamma_v\,T' \qquad (5.6)$$

seconds.

As shown in fig. 3, the records of Adventurer look different. While moving away Adven- 
turer receives fewer signals from Caretaker than during his return trip, simply because it 
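takes the signals from Caretaker a certain amount of time to reach Adventurer. Until
Adventurer’s point of return he has received $(\tfrac{1}{2}T - L/c)/\Delta\tau = \tfrac{1}{2}T\,(1-\beta)/\Delta\tau$ light signals with
a time period of $k(\beta)\,\Delta\tau$. During his return trip he receives $(\tfrac{1}{2}T + L/c)/\Delta\tau = \tfrac{1}{2}T\,(1+\beta)/\Delta\tau$
signals with time period $k(-\beta)\,\Delta\tau$. Therefore Adventurer concludes that his trip takes

$$T' = \tfrac{1}{2}T\,(1-\beta)\,k(\beta) + \tfrac{1}{2}T\,(1+\beta)\,k(-\beta) = \frac{T}{\gamma_v} \qquad (5.7)$$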


seconds. Clearly T > T’, so that Caretaker has aged more during the trip than his brother. 
Note that (5.6) and (5.7) are equivalent. 
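
The bookkeeping of light signals can be reproduced with a few lines of Python. This is a sketch under our own naming (`caretaker_record` and `adventurer_record` are hypothetical helpers), with β = 0.6 chosen because it gives k(β) = 2 and a gamma factor of 1.25.

```python
import math


def k_factor(beta):
    """Factor k(beta) of (3.13)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def caretaker_record(t_adventurer, beta, dtau):
    """Total time on Caretaker's clock, from the signals he records, cf. (5.6)."""
    n_half = 0.5 * t_adventurer / dtau            # signals emitted on each leg
    return (n_half * k_factor(beta) * dtau        # low-frequency signals
            + n_half * k_factor(-beta) * dtau)    # high-frequency signals


def adventurer_record(t_caretaker, beta, dtau):
    """Total time on Adventurer's clock, from the signals he records, cf. (5.7)."""
    n_out = 0.5 * t_caretaker * (1.0 - beta) / dtau    # received while receding
    n_back = 0.5 * t_caretaker * (1.0 + beta) / dtau   # received while returning
    return n_out * k_factor(beta) * dtau + n_back * k_factor(-beta) * dtau


if __name__ == "__main__":
    beta, dtau = 0.6, 1.0
    t_adventurer = 8.0                                  # T' in units of dtau
    t_caretaker = caretaker_record(t_adventurer, beta, dtau)
    print("T  =", t_caretaker)
    print("T' =", adventurer_record(t_caretaker, beta, dtau))
```

Starting from T’ on Adventurer’s clock one obtains T on Caretaker’s clock from (5.6), and inserting this T into (5.7) returns the original T’, which illustrates that the two equations are equivalent.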

The paradox arises when one insists that there should be no difference between the 
two brothers. One then concludes that the trip looks the same from the point of view of 
Caretaker as from the point of view of Adventurer. Each sees his brother taking off and 
returning after a while. The crucial difference between the two brothers is, however, that 
Adventurer feels an acceleration when he decides to return, while Caretaker does not. One 
could say that Adventurer jumps from one inertial frame, which is receding from Caretaker 
with a velocity v, to another one, which is approaching Caretaker with velocity v. Before 
and after this jump, one may correctly claim that there is no way to distinguish between 
the brothers, in accord with the relativity postulate. Of course it is possible to perform 
a (time-dependent) coordinate transformation such that Adventurer is at rest during the 
whole trip, but such a transformation is not allowed in the context of the theory of special
relativity, as we are no longer transforming from one inertial frame to another. If one insists
on performing this transformation one has to make use of the theory of general relativity.
For now the conclusion is that time is not a global quantity. The meaning of time can be 
attached to a clock or an observer, but cannot be defined for all observers at once. 


Exercise 5.1: Twin travel — Draw a spacetime diagram of the trip of Adventurer, based on 
the restframe of Caretaker. Argue that one can make a similar diagram based on Adventurer’s 
restframe, but that this does not correspond to a given inertial frame. Therefore, draw two other 
spacetime diagrams, one associated with the restframe of Adventurer immediately after his de- 
parture, and another one based on his restframe just prior to his return. 


6 The Lorentz-FitzGerald contraction 


As mentioned earlier, measurements of time differences can be converted into distance 
measurements. Therefore time dilation implies a similar effect for length measurements. 
It turns out that the length of a moving object in the direction of motion is shorter than
the length measured at rest. This phenomenon is called Lorentz-FitzGerald contraction.⁸

Consider an observer who travels along a stick of length $\lambda$, as measured in the rest
frame of the stick, with velocity v. According to the observer the stick passes him with
opposite but equal velocity and when each of the endpoints of the stick passes by, he checks
his watch. In this way he measures a time difference equal to $\tau$, and because he is travelling
with velocity v, he concludes that the length of the stick is equal to $\ell = v\tau$.

A second observer who is at rest relative to the stick, or rather two observers with
synchronized clocks who are located at each of the endpoints of the stick, measure also
the time when the moving observer’s position coincides with that of the endpoints. They
measure a time difference t and conclude that the length of the stick is equal to $\lambda = vt$.
However, because of time dilation we know that $t = \gamma_v\,\tau$. Combining these results, it
follows that $\lambda$ and $\ell$ must be related according to

$$\ell = \frac{v\,t}{\gamma_v} = \frac{\lambda}{\gamma_v}. \qquad (6.1)$$
Therefore the length of a moving body along the direction of motion is smaller than the 
corresponding length measured in the rest frame. Consequently, neither time nor distance 
has an absolute meaning. One must always specify in which frame these quantities are 
measured. 
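
The two timing measurements described above can be summarized in a small sketch. The function names and the sample numbers are illustrative assumptions; lengths are expressed in units where c = 1 unless another value is passed.

```python
import math


def gamma_factor(beta):
    """Gamma factor of (5.4)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def stick_lengths(beta, tau, c=1.0):
    """Lengths inferred from the two timing measurements of this section.

    tau : time between the passing of the two endpoints, measured on the
          watch of the travelling observer (a proper time interval)
    Returns (ell, lam): the length according to the travelling observer and
    the length according to the observers at rest relative to the stick.
    """
    v = beta * c
    ell = v * tau                      # travelling observer: ell = v * tau
    t = gamma_factor(beta) * tau       # time dilation: t = gamma * tau
    lam = v * t                        # rest frame of the stick: lambda = v * t
    return ell, lam


if __name__ == "__main__":
    ell, lam = stick_lengths(beta=0.6, tau=1.0)
    print(ell, lam, lam / ell)         # the ratio equals the gamma factor
```

The ratio of the two lengths equals the gamma factor, in agreement with (6.1).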

Hence the Lorentz-FitzGerald contraction rests on a proper definition of “length”. As 
it turns out a length measurement must refer to the motion of the observer and as such the 
contraction is of kinematic origin. Nevertheless, it is real and undeniable, just as the time 
dilation effect described in the previous section. Note also that these effects are not in any 


8For an interesting account of FitzGerald’s contribution, see J.S. Bell, in Physics World, vol. 5 no. 9, 
p. 31. 




way caused by the fact that the results of the measurements still have to be carried (usually
by light signals) over certain distances, which would require us to correct the outcome of 
measuring a time or length interval. One can easily verify that our derivations are free of 
such complications. 

The Lorentz-FitzGerald contraction should not primarily be thought of as a dynamical 
effect, as is sometimes done in the literature, probably because the derivations before 
Einstein’s were based on considerations of the nature of the forces that physically determine 
the length of an object. Of course, accepting the theory of special relativity implies that 
one must insist that also the dynamics is consistent with the principles of special relativity. 
In that case the forces viewed by an observer who moves relative to some object, will change 
such that they do give rise to a length (as measured by this observer) that is contracted 
as compared to the length measured by an observer who is at rest relative to the object. 
However, the object itself is not affected by these measurements and the fact that different 
observers quote different lengths. 


7 Addition rule for parallel velocities 


In daily life we are accustomed to the idea that velocities can be added as vectors. If an 
observer 1 has a velocity $v_{12}$ relative to an observer 2, which in turn moves with velocity
$v_{23}$ relative to a third observer 3, then the first observer moves with velocity

$$v_{13} = v_{12} + v_{23} \qquad (7.1)$$

with respect to observer 3. Here we assume that all three observers are moving in the
same direction, so that there is no need for a vector notation. Furthermore we use the
convention that $v_{ij}$ denotes the velocity of observer i with respect to observer j, so that $v_{ji}$
is the velocity of observer j with respect to observer i, which is therefore opposite to $v_{ij}$,

$$v_{ij} = -v_{ji}. \qquad (7.2)$$

This allows us to write (7.1) in the form

$$v_{12} + v_{23} + v_{31} = 0. \qquad (7.3)$$


Evidently (7.1) and (7.3) cannot be true according to the theory of relativity, because this 
addition rule disagrees with the light postulate. 

In order to determine the relativistic rule for adding parallel velocities (we will discuss 
the more general case later), consider again three observers denoted by 1, 2 and 3. The first
one is at rest, while 2 is moving with constant velocity $v_{21}$, and 3 is moving with constant
velocity $v_{31}$ in the direction of the positive x-axis. Assume that $v_{21} < v_{31}$. At t = 0
all three observers are at the same position, which we denote by x = 0, and synchronize
their watches. At some later time $T_1$, observer 1 emits a light signal which is detected by
observer 2 at time $T_2$ according to his watch. At that moment observer 2 emits a light
signal in the same direction, which is detected by observer 3 at time $T_3$ on his watch (see
figure 4). 

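We now want to determine the velocity $v_{32}$ of observer 3 relative to observer 2. We first
note that (3.12) implies that

$$T_3 = k(\beta_{32})\,T_2, \qquad T_2 = k(\beta_{21})\,T_1, \qquad (7.4)$$

where we use the obvious notation

$$\beta_{ij} = \frac{v_{ij}}{c}, \qquad \beta_{ij} = -\beta_{ji}. \qquad (7.5)$$

However, if we ignore the observer 2, we have the situation that a light signal is emitted
by observer 1 at $T_1$ and received by observer 3 at time $T_3$. So we also have

$$T_3 = k(\beta_{31})\,T_1. \qquad (7.6)$$

Combining (7.4) and (7.6) thus leads to

$$k(\beta_{31}) = k(\beta_{32})\,k(\beta_{21}), \qquad (7.7)$$

or,

$$\frac{1+\beta_{31}}{1-\beta_{31}} = \frac{1+\beta_{32}}{1-\beta_{32}}\;\frac{1+\beta_{21}}{1-\beta_{21}}. \qquad (7.8)$$

This can be written as

$$\beta_{12}\,\beta_{23}\,\beta_{31} + \beta_{12} + \beta_{23} + \beta_{31} = 0. \qquad (7.9)$$

In terms of velocities we conclude that (7.3) must be replaced by

$$v_{12} + v_{23} + v_{31} = -\frac{v_{12}\,v_{23}\,v_{31}}{c^2}. \qquad (7.10)$$

Solving this equation for $v_{13}$ gives the relativistic addition rule

$$v_{13} = \frac{v_{12} + v_{23}}{1 + v_{12}\,v_{23}/c^2}. \qquad (7.11)$$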


This modified addition rule is in agreement with the light postulate. Adding two 
velocities $v_{12} = v$ and $v_{23} = c$ gives $v_{13} = c$, so that the velocity of light remains the same
in all inertial frames. Of course for velocities small compared to the velocity of light we 
obtain the classical result (7.1). 
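
These properties of the addition rule are easily checked numerically. The sketch below works with velocities in units of c; the function names `add_parallel` and `k_factor` are our own choice for this illustration.

```python
import math


def k_factor(beta):
    """Factor k(beta) of (3.13)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def add_parallel(beta_12, beta_23):
    """Relativistic addition rule (7.11) for parallel velocities, in units of c."""
    return (beta_12 + beta_23) / (1.0 + beta_12 * beta_23)


if __name__ == "__main__":
    # Two velocities of half the light velocity do not add up to c.
    print(add_parallel(0.5, 0.5))
    # Adding the velocity of light to any velocity gives the velocity of light.
    print(add_parallel(0.3, 1.0))
    # For small velocities the classical rule (7.1) is recovered approximately.
    print(add_parallel(1e-4, 2e-4))
    # The k factors multiply, which is the content of (7.7).
    print(k_factor(add_parallel(0.5, 0.5)), k_factor(0.5) ** 2)
```

The sketch does not guard against the excluded combination of one velocity equal to c and the other equal to −c, for which the denominator of (7.11) vanishes.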

Already in 1851 Fizeau performed an experiment to measure the velocity of light in 
moving liquids⁹. The velocity of light in the liquid at rest is equal to c/n, where n is the
index of refraction of the liquid. Denoting the velocity of the liquid by v, the velocity of
light in the moving liquid is equal to

$$\frac{c/n + v}{1 + v/(nc)} = \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right) + \cdots, \qquad (7.12)$$


9In 1849 Fizeau had already carried out the first terrestrial determination of the speed of light (in air).


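[Figure 4: space-time diagram with vertical axis ct, on which the emission time of the first light signal is marked as $cT_1$.]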


Figure 4: Space-time diagram of three observers moving at constant relative velocities. 


where we used (7.11) and the dots indicate terms of order $v^2/c$. Fizeau’s experimental
results were in reasonable agreement with this formula, which was already derived by
Fresnel in 1818, based on some version of the aether theory. The relativistic derivation
given here was published by von Laue in 1907¹⁰. Later Lorentz modified (7.12) by taking
into account dispersion effects. Dispersion means that the index of refraction depends 
on the frequency of the light. Therefore one has to take into account the fact that the 
frequency experienced by the liquid is shifted as a result of the Doppler effect. Lorentz’ 
corrections were later confirmed in a series of very accurate experiments performed by 
Zeeman. 
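
To see how small the neglected terms in (7.12) are in practice, one can compare the exact expression with the approximation. In the sketch below the index of refraction (n = 1.33, roughly that of water) and the flow velocity (7 m/s) are assumed illustrative values, not data from Fizeau’s experiment, and dispersion is ignored. The function names are our own.

```python
C = 299_792_458.0     # speed of light in vacuum, in m/s


def light_speed_in_moving_liquid(n, v):
    """Exact result obtained from the addition rule (7.11)."""
    return (C / n + v) / (1.0 + v / (n * C))


def fresnel_approximation(n, v):
    """Leading terms of (7.12): c/n plus the partially dragged velocity."""
    return C / n + v * (1.0 - 1.0 / n**2)


if __name__ == "__main__":
    n = 1.33      # assumed index of refraction
    v = 7.0       # assumed velocity of the liquid, in m/s
    exact = light_speed_in_moving_liquid(n, v)
    approx = fresnel_approximation(n, v)
    print("velocity gained from the moving liquid (m/s):", exact - C / n)
    print("drag term v (1 - 1/n^2) (m/s):", v * (1.0 - 1.0 / n**2))
    print("exact minus approximation (m/s):", exact - approx)
```

For laboratory flow velocities the neglected terms of order v²/c are far below the drag term itself, so the linear formula is an excellent approximation.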


Exercise 7.1: Maximal velocity — Rewrite (7.11) as 
$$\beta_3 = \frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2},$$

and consider the values of $\beta_3$ for $-1 \le \beta_{1,2} \le 1$. Analyze this, for instance, by first proving that
$\beta_3$ has only (local) extremal points whenever $|\beta_1|$ or $|\beta_2|$ equals unity (exclude $\beta_1\beta_2 = -1$).


8 The special Lorentz transformation 


It is clearly desirable to have a general formula that relates the spatial coordinates and the 
time for a given event as measured in two different inertial frames. This formula follows 


10M. von Laue, Ann. der Physik 23 (1907) 989. 




Figure 5: The two inertial frames S and S’ used in the text. 


from the so-called special Lorentz transformation, which incorporates all the information 
one needs to relate observations done in two different inertial frames. The transformation 
encompasses all the results that we derived before and will serve as a general starting point 
for all subsequent considerations. 

To derive the formula for the Lorentz transformation let us again start from two inertial 
frames, called S and S’. The origins of the two frames are denoted by O and O’ and the 
clocks, which are at rest at the origin of the frames, have been synchronized such that they 
coincide at $t = t' = 0$. For simplicity we assume that the two frames move with a relative
velocity $\vec v$ such that the x-axis, the x’-axis and the velocity vector $\vec v$ of S’ with respect to
S are pointing in the same direction (see figure 5). Consider the situation summarized in
figure 6. From the origin O of S at some positive time, a light signal is emitted along the
(positive) x-axis. The signal passes the origin O’ of S’ at some later time (v < c) and is
subsequently reflected by a mirror. After reflection, the light signal passes again the origin
O’ of S’ and is finally detected at the origin O. Let us denote the coordinates and the time
at which the light signal is reflected by the mirror by x, t in S and x’, t’ in S’, respectively.

According to the clock of S the light signal was emitted at O at time $t - x/c$ and
returned at time $t + x/c$. According to the clock in S’ the light signal passes the origin O’
at times $t' - x'/c$ and $t' + x'/c$. Note that it is not relevant for this result whether or not
the mirror is moving. We only make use of the fact that after reflection, the light signal 
has to travel over the same distance in order to return to the origin O or O’, respectively. 

Now we know that the time of emission of a light signal and the time that it is received 
by an observer who is receding with constant velocity from the source of emission, are 


[Figure 6: space-time diagram; labels recoverable from the figure: ct’ + x’, ct, ct − x.]


Figure 6: Space-time diagram of two experimental situations described in the text, which 
are used to derive the Lorentz transformation. A light signal is reflected by a mirror and 
the consequences are observed in two inertial frames. 


related according to (3.12). Therefore we have the following two relations, 


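$$t' - \frac{x'}{c} = k(\beta)\left(t - \frac{x}{c}\right),$$
$$t + \frac{x}{c} = k(\beta)\left(t' + \frac{x'}{c}\right). \qquad (8.1)$$

Combining the above equations, we obtain the special Lorentz transformation

$$x' = \gamma_v\,(x - vt), \qquad t' = \gamma_v\left(t - \frac{v\,x}{c^2}\right), \qquad (8.2)$$

where $\gamma_v$ was defined in (5.4).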

Observe that our derivation applies, strictly speaking, only to the case x < ct, because 
the distance x must be travelled by a light signal within a time interval equal to t. However, 
we can consider a slightly different situation where the light signal is emitted at O’ at 
negative time, passes O and is then reflected by the mirror. Meanwhile O’ has passed O 
so that the reflected signal is first recorded by O’ and finally by O. In that case we have 
$x > ct$. We leave the derivation of (8.2) for this case to the reader.
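
The step from (8.1) to (8.2) can also be checked numerically. The following Python sketch solves the two relations (8.1) for x’ and t’ and compares the outcome with the closed form (8.2). The function names are hypothetical and the sample events are arbitrary; we use units with c = 1 by default.

```python
import math


def k_factor(beta):
    """Factor k(beta) of (3.13)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def transform_via_light_signals(x, t, beta, c=1.0):
    """Solve the two relations (8.1) for the coordinates (x', t') in S'."""
    k = k_factor(beta)
    t_minus = k * (t - x / c)        # t' - x'/c, first relation of (8.1)
    t_plus = (t + x / c) / k         # t' + x'/c, second relation of (8.1)
    x_prime = 0.5 * c * (t_plus - t_minus)
    t_prime = 0.5 * (t_plus + t_minus)
    return x_prime, t_prime


def special_lorentz(x, t, beta, c=1.0):
    """Closed form (8.2) of the special Lorentz transformation."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    v = beta * c
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


if __name__ == "__main__":
    for x, t in ((1.0, 2.0), (3.0, 1.0)):    # one event with x < ct, one with x > ct
        print(transform_via_light_signals(x, t, 0.6), special_lorentz(x, t, 0.6))
```

The same algebra reproduces (8.2) for the sample event with x > ct as well, although this numerical agreement is of course no substitute for the derivation left to the reader.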

We derived (8.2) for a light signal moving along the x-axis, but it is easy to see that 
the same result holds for light signals moving parallel to the x-axis, i.e. at some finite 
but fixed value of the coordinates y and z. This implies that plane waves moving along 
the x-axis remain plane waves when observed in the other frame. Actually, it is hard to
see how this could not be the case, as the experiment remains the same under arbitrary
translations in directions perpendicular to the x-axis. In other words, the result should 
remain independent of the value of the y and z coordinate. 

The above argument does not yet determine what the values are of the transverse
coordinates $y'$ and $z'$ as measured in S’. It turns out that distances perpendicular to $\vec{v}$
remain in fact the same in the two frames, so that


\[ y' = y, \qquad z' = z. \tag{8.3} \]


In order to derive this result, consider two identical rigid rods, which are pointing in the
same direction. One rod is at rest, and the other one is moving in a direction perpendicular
to it such that at a certain moment the two rods will completely overlap. According to the 
above arguments, the situation will be the same when viewed in the other frame in which 
the first rod is moving and the second one is at rest. To each of the rods we have attached 
two needles at equal separation length, which leave two scratches on the other rod when 
passing. After the experiment the distance between the two scratches is compared to the 
distance between the needles. If the distances are not the same, then a moving rod is not of 
the same length as a rod at rest. This leads to a contradiction. According to one observer, 
the two rods must have overlapped at a certain moment, so that a certain section of one 
rod has overlapped a similar but shorter section of the other rod. On the other hand, 
according to the other observer, a slightly bigger section of the second rod has overlapped 
with the same section of the first rod, so that one must conclude that two points of the 
same rod must have overlapped as well. This is clearly impossible for a rigid rod, and 
we conclude that distances remain the same for observers moving at constant velocity in 
a direction perpendicular to the distance. Stated differently, the Lorentz contraction acts 
only on longitudinal, but not on transverse distances. 

We can write (8.2) and (8.3) in such a way that the formulae apply to general velocities 
v, which are not necessarily parallel to the x-axis, 


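\[
\vec{r}\,'_{\parallel} = \gamma_v\,(\vec{r}_{\parallel} - \vec{v}\,t), \qquad
\vec{r}\,'_{\perp} = \vec{r}_{\perp}, \qquad
t' = \gamma_v\left(t - \frac{\vec{v}\cdot\vec{r}}{c^2}\right), \tag{8.4}
\]


where we decomposed the position vector $\vec{r}$ into a vector $\vec{r}_{\parallel}$ parallel to $\vec{v}$ and a vector $\vec{r}_{\perp}$
perpendicular to $\vec{v}$. We thus have the obvious identities $\vec{r} = \vec{r}_{\parallel} + \vec{r}_{\perp}$ and $\vec{v} = \vec{v}_{\parallel}$. Note the
difference between the Lorentz and the Galilei transformation given in (1.1).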
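As an illustration of the decomposition into parallel and perpendicular parts, the following sketch applies a Lorentz transformation for an arbitrary velocity direction. It assumes that the time transforms as $t' = \gamma_v (t - \vec{v}\cdot\vec{r}/c^2)$ and works in units with c = 1 by default; the function name `boost` is hypothetical.

```python
import math

def boost(r, t, v, c=1.0):
    """Return (r', t') for a frame S' moving with velocity v relative to S."""
    v2 = sum(vi * vi for vi in v)
    if v2 == 0.0:
        return list(r), t
    gamma = 1.0 / math.sqrt(1.0 - v2 / c ** 2)
    r_dot_v = sum(ri * vi for ri, vi in zip(r, v))
    r_par = [r_dot_v / v2 * vi for vi in v]          # part of r parallel to v
    r_perp = [ri - pi for ri, pi in zip(r, r_par)]   # part of r perpendicular to v
    r_par_new = [gamma * (pi - vi * t) for pi, vi in zip(r_par, v)]
    t_new = gamma * (t - r_dot_v / c ** 2)
    return [a + b for a, b in zip(r_par_new, r_perp)], t_new
```

For $\vec{v}$ along the x-axis this reduces to the special Lorentz transformation, with y and z unchanged.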

The Lorentz transformation is of central importance for the theory of special relativity. 
From it we can directly rederive all the results obtained so far. We illustrate this for the
time dilation and the Lorentz-FitzGerald contraction in exercises 8.1 and 8.2. 


Exercise 8.1: Time dilation — Consider two events that take place at the origin of S’ at time 
$t' = 0$ and $t' = \tau$. According to the first equation in (8.2) these events occur at $x = vt$ in S.
Substitute this result into the second equation for $t'$ and derive that $t'$ and $t$ are linearly related
by $t' = \gamma_v^{-1} t$. Hence in S the two events take place at $t = 0$ and $t = \gamma_v\tau$. Argue that this is
precisely in accord with (5.5). Thus the time interval measured in S is larger than $\tau$. Note that
the two events do not occur at the same place in S, but at $x = 0$ and $x = \gamma_v v\tau$, respectively.


Exercise 8.2: Length contraction — To derive the Lorentz-FitzGerald contraction we take a rigid 
rod at rest in S’ of length $\lambda$. It is directed along the $x'$-axis with endpoints at $x' = 0$ and $x' = \lambda$.
Again we invoke (8.2) to calculate the corresponding points in S, but now measured at the same
instant of time, say $t = 0$. Substituting $t = 0$ into (8.2) yields $x' = \gamma_v x$. Therefore the endpoints
of the rod are at $x = 0$ and $x = \gamma_v^{-1}\lambda$. Hence the length of the rod measured in S is smaller and
given by $\gamma_v^{-1}\lambda$, in agreement with the formula for the length contraction.
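The two exercises above can be checked with a few numbers. The sketch below uses the inverse transformation (obtained from (8.2) by replacing v with -v), units with c = 1, and purely illustrative values for the velocity, the time interval and the rod length; all names are hypothetical.

```python
import math

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def to_S(x_prime, t_prime, v, c=1.0):
    """Inverse of (8.2): coordinates in S of an event given in S'."""
    g = gamma(v, c)
    return g * (x_prime + v * t_prime), g * (t_prime + v * x_prime / c ** 2)

v, tau, rod = 0.8, 1.0, 1.0

# Time dilation: the second tick of a clock at the origin of S'.
x_tick, t_tick = to_S(0.0, tau, v)

# Length contraction: the far end of the rod, located in S at t = 0.
# Requiring t = 0 fixes t' = -v x'/c^2 for that event.
x_end, t_end = to_S(rod, -v * rod, v)
```

With these values `t_tick` equals $\gamma_v\tau$ and `x_end` equals the rod length divided by $\gamma_v$, while `t_end` vanishes, as required for a length measurement in S.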


Finally we note an important and characteristic property of the Lorentz transformation,
namely that $x^2 + y^2 + z^2 - c^2t^2$ remains invariant. More explicitly, we have


\[ x'^2 + y'^2 + z'^2 - c^2t'^2 = x^2 + y^2 + z^2 - c^2t^2. \tag{8.5} \]


This result, which can be verified directly from (8.4), plays an important role later on. It
is often used as the defining property of the Lorentz transformations. The relation (8.5)
can be easily interpreted for the case that $x^2 + y^2 + z^2 - c^2t^2 = 0$, which is satisfied for a
light signal emitted at t = 0 from the origin in some spatial direction. Because of (8.5) we 
find the same condition in the new coordinates, in accord with the fact that the velocity 
of light is the same in both frames. However, the spatial direction of the light ray will in 
general be different in the other frame. 
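A small numerical sketch may make the last two statements concrete. It follows a light ray emitted from the origin at t = 0 at an angle of 60 degrees to the x-axis (an arbitrary illustrative choice), in units with c = 1, and transforms one event on the ray to a frame moving with v = 0.5 along the x-axis. The function names are hypothetical.

```python
import math

def special_boost(x, y, z, t, v, c=1.0):
    """Special Lorentz transformation along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), y, z, gamma * (t - v * x / c ** 2)

def interval(x, y, z, t, c=1.0):
    """The combination x^2 + y^2 + z^2 - c^2 t^2."""
    return x * x + y * y + z * z - (c * t) ** 2

theta = math.radians(60.0)
t = 2.0
event = (t * math.cos(theta), t * math.sin(theta), 0.0, t)
event_prime = special_boost(*event, v=0.5)

# Direction of the ray in the new frame.
theta_prime = math.atan2(event_prime[1], event_prime[0])
```

The combination vanishes in both frames, but the direction changes: for these values the ray is perpendicular to the x'-axis in the new frame.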


Exercise 8.3: Electron signals — A spacecraft leaves the earth with constant velocity v. It is not 
possible to maintain contact by means of radio or light signals. Instead one has to communicate 
via electrons which are sent by a small accelerator, one on earth and another one on the
spacecraft. The electrons sent from earth have velocity w and to be able to reach the spacecraft, we
must insist on w > v.

At the moment of the launch the clocks on the earth and the spacecraft are synchronized. At time 
T, the first bunch of electrons is sent from earth. This bunch is received at time T’, measured on 
the clock of the spacecraft. 

Draw a spacetime diagram of the situation described above. 

Then determine the time $t_1$, measured on the earth-bound clock, at which the electron signal
reaches the spacecraft. Observe that the result is derived in the same way as (3.2) but takes a
slightly different form,


\[ t_1 = \frac{T}{1 - v/w}. \]


Determine, by making use of the Lorentz transformation, the time $T'$ of the arrival of the electron
signal as measured on the clock on the spacecraft. Interpret the result, $T' = \gamma_v^{-1} t_1$, as a time
dilation effect.

Immediately after receipt of the signal, the spacecraft sends a confirmation in the form of an
electron signal from its own accelerator. This is detected on earth at time


\[ T'' = \frac{\alpha\,T}{1 - v/w}. \]


Argue that $\alpha$ must be bigger than a certain critical value from the fact that the return signal
cannot move faster than light. Argue also that there is no maximal value for $\alpha$. Denote the
velocity of the electron signal as measured on earth by $u$ and prove from $v t_1 = (T'' - t_1)\,u$ that
$\alpha = 1 + v/u$. Specify now the minimal value of $\alpha$.

Determine the velocity $u'$ of the returning electrons expressed in the rest frame of the spacecraft.
Use of (7.11) leads to the result


\[ u' = \frac{u + v}{1 + uv/c^2}. \]


Determine now $\alpha$ assuming that the electron accelerators on earth and on the spacecraft are
identical. Compare the result, $\alpha = [\gamma_v^2(1 - v/w)]^{-1}$, with results obtained in section 3 in the limit
that $w \to c$.
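For readers who want to check their answer to the last part of exercise 8.3, here is a small numerical sketch. It assumes identical accelerators, so that the returning electrons have speed w in the rest frame of the spacecraft, and it uses units with c = 1; the symbol $\alpha$ denotes the factor in the detection time and the function names are hypothetical.

```python
def alpha_factor(v, w, c=1.0):
    """alpha = 1 + v/u, with u the earth-frame speed of the returning electrons."""
    # Invert u' = (u + v) / (1 + u v / c^2) for u, with u' = w.
    u = (w - v) / (1.0 - w * v / c ** 2)
    return 1.0 + v / u

def alpha_closed_form(v, w, c=1.0):
    """The closed form [gamma_v^2 (1 - v/w)]^(-1)."""
    gamma_sq = 1.0 / (1.0 - (v / c) ** 2)
    return 1.0 / (gamma_sq * (1.0 - v / w))
```

In the limit w = c the closed form reduces to 1 + v/c, which can be compared with the light-signal results of section 3.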


9 Space and time for accelerated observers 


When discussing the twin paradox, we concluded that the two twins were not in an equivalent
situation, because one of them was subject to a sudden acceleration by which his
velocity (relative to his brother) was reversed. This example leads us to suspect that acceleration
can give rise to rather unusual phenomena in a relativistic context. Now that we
have the Lorentz transformation at our disposal we can study such phenomena in a more
explicit fashion.

The Lorentz transformation derived above allows us to convert the coordinates and the 
time associated with a single event measured in one inertial frame into the coordinates and 
the time measured for the same event in another inertial frame. Obviously this can also 
be done for a sequence of separate events. Hence we may consider a particle moving in 
space and described by an observer at rest in some inertial frame. Viewed as a function 
of the time t, the motion of the particle then defines a trajectory in a space-time diagram 
of the type considered before, according to which at each value of t a vector $\vec{r}$ is assigned
corresponding to the position of the particle. Each point of the space-time trajectory, 
characterized by the value of t, thus defines an event. 

The space-time trajectory swept out by the particle is described by specifying its three
coordinates (measured in a given inertial frame) as a function of the time t. These three
coordinates, x(t), y(t) and z(t), comprise a vector $\vec{r}(t)$, which depends on t, and its time
derivative defines the velocity vector, again as a function of time. In principle we can
also describe the motion of the particle as seen by an observer moving at a constant
relative velocity with respect to the initial coordinate frame, by applying a corresponding
Lorentz transformation to each point of the orbit (each characterized by a certain value of
t) separately, thus reconstructing the trajectory as observed in the other frame, point by
point. Straightforward application of the Lorentz transformation to the coordinates
yields the position vector $\vec{r}\,'(t)$ in the new inertial frame, but still parametrized in
terms of the original time variable t. However, for an observer at rest in the new frame, t
is a rather irrelevant parameter as his clock measures the time t’. To obtain the space-time
trajectory relevant to the new frame we must obtain $\vec{r}\,'$ as a function of t’. This is not
difficult, at least not in principle. Each point of the original trajectory is parametrized by t
and by means of the Lorentz transformation applied to the corresponding space-time point
(event), each point yields a corresponding value for t’. To perform all these substitutions
may be rather complicated in practice, but ultimately one obtains the particle coordinates
$\vec{r}\,'$ as functions of the time t’.

Let us demonstrate this procedure and apply it to a specific trajectory. The purpose 
of the subsequent discussion is not to elaborate on the details of the calculations, which 
admittedly are somewhat involved. The calculations can be skipped at first reading. The 
specific example is presented to show how such results can be obtained in an explicit fashion, 
without relying on qualitative arguments. Our goal is to elucidate the rather important 
physical consequences of the result, which themselves do not depend on the details of the 
calculation. 

Consider a particle, which, for $t < T_1 = 0$, is at rest in an inertial frame S and located
at a position specified by the coordinates $x = L$ and $y = z = 0$. At $t = 0$ the particle
experiences a constant acceleration $\vec{a} = (a, 0, 0)$ directed along the x-axis; y and z thus
remain zero. At some later time, $t = T_2 = v/a$, the acceleration is switched off and the
particle moves at a constant velocity $\vec{v} = (v, 0, 0)$ directed along the x-axis. It is not
difficult to specify the orbit followed by the particle as a function of the time t. Ignoring
the y and z coordinates, which remain zero, we have


\[
x(t) = \begin{cases}
L & \text{for } t < T_1 = 0, \\[4pt]
L + \tfrac{1}{2}\,a t^2 & \text{for } T_1 < t < T_2, \\[4pt]
L + vt - \dfrac{v^2}{2a} & \text{for } t > T_2 = v/a.
\end{cases} \tag{9.1}
\]


Before proceeding we make the following observations: 


• First of all, at this point we are just assuming that the particle is subject to a constant
acceleration a during a finite time interval. We do not (yet) know how to achieve
this; more precisely, we do not know what the applied force should be in order that
the particle indeed experiences a constant acceleration. While in Newtonian
mechanics such a force must be constant, this is not so in relativistic mechanics; as
we will discover in due time the force must be increased in time in order to sustain
the constant acceleration.


• Secondly, when the time interval during which the acceleration is applied or the acceleration
itself is sufficiently large, the velocity acquired by the particle may eventually
exceed the velocity of light. Again we don’t know yet whether this will be practically
possible, but we insist on excluding this situation by restricting the force to a finite
interval such that the velocity will remain smaller than that of light. This implies
that the force is only applied during a time interval smaller than $\Delta t = c/|a|$. As we
shall see in a moment, when this condition is not respected rather strange phenomena
will happen, which seem physically unacceptable.


Figure 7: The trajectory swept out by the motion of a particle as described in the text.
The diagrams (a) and (b) show the trajectory as seen by observers at rest in the inertial
frames S (axes x and ct) and S’ (axes x’ and ct’), respectively. The times $T_1$ and $T_2$ in the first frame and $T'_1$ and $T'_2$ in the
second frame refer to the times at which the acceleration is switched on and off, respectively.
In the diagrams, we chose positive values for L, v and a.


• Thirdly, we impose a constant acceleration as measured in the inertial frame S. In
other words, the velocity of the particle as measured in S increases linearly in the
time relevant to that frame. However, other observers moving at constant relative
velocity with respect to S will not agree with that statement. Furthermore one
can also determine the acceleration in the instantaneous rest frame of the particle (in
other words, the velocity acquired per unit of proper time) and again this acceleration
is not constant.


We return to some of these observations shortly, so let us now proceed and describe the
space-time trajectory in some other inertial frame. We choose a second inertial frame S’,
moving at a constant velocity v, the same velocity as acquired by the particle after the
acceleration process. The clocks are synchronized in the usual fashion, so that the origins
of the two frames S and S’ coincide at $t = t' = 0$. According to an observer at rest in
S’, the particle will initially move with velocity $-v$, then it will experience a deceleration
during some finite time interval $T'_1 < t' < T'_2$, after which it will come to rest and stay that
way. The trajectories of the particle seen by an observer at rest in S and by an observer
at rest in S’, are shown in figure 7.

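Let us now apply the Lorentz transformation to a given point on the particle trajectory,
characterized by a time t, which defines a single event. Application of (8.4) to (9.1) leads
directly to the coordinate $x'$ ($y'$ and $z'$ remain zero and are ignored henceforth)


\[
x'(t) = \begin{cases}
\gamma_v L - \gamma_v v t & \text{for } t < T_1 = 0, \\[4pt]
\gamma_v L - \gamma_v v t + \tfrac{1}{2}\,\gamma_v a t^2 & \text{for } T_1 < t < T_2, \\[4pt]
\gamma_v \left(L - \dfrac{v^2}{2a}\right) & \text{for } t > T_2 = v/a.
\end{cases} \tag{9.2}
\]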


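In order to obtain the trajectory as seen by an observer located at the origin of S’, we
must determine $x'$ as a function of the appropriate time $t'$. Therefore we first determine
$t'$ for the same point of the trajectory characterized by t, which follows also directly from
combining (8.4) and (9.1),


\[
t' = \begin{cases}
\gamma_v \left(t - \dfrac{v}{c^2}\,L\right) & \text{for } t < T_1 = 0, \\[8pt]
\gamma_v \left(t - \dfrac{v}{c^2}\left(L + \tfrac{1}{2}\,a t^2\right)\right) & \text{for } T_1 < t < T_2, \\[8pt]
\gamma_v^{-1}\, t - \dfrac{\gamma_v v}{c^2}\left(L - \dfrac{v^2}{2a}\right) & \text{for } t > T_2 = v/a.
\end{cases} \tag{9.3}
\]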

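By substituting $t = T_1$ and $t = T_2$ in the expressions above, we find the corresponding
values for $t'$ at which the acceleration is switched on and off. They are denoted by $T'_1$ and
$T'_2$, respectively, and equal to

\[ T'_1 = -\frac{\gamma_v v}{c^2}\,L, \qquad T'_2 = \tfrac{1}{2}\left(\gamma_v + \gamma_v^{-1}\right) T_2 - \frac{\gamma_v v}{c^2}\,L, \tag{9.4} \]
and thus $T'_2 - T'_1 = \tfrac{1}{2}(\gamma_v + \gamma_v^{-1})(T_2 - T_1)$, independent of L. Observe that this implies that
$T'_2 - T'_1 \geq T_2 - T_1$.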
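The switching times in S’ can be reproduced numerically by transforming the two events on the trajectory at which the acceleration is switched on and off. The sketch below assumes the piecewise trajectory of (9.1), units with c = 1 and purely illustrative values L = 1, a = 0.5 and v = 0.6; all function names are hypothetical.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for this illustration)

def x_of_t(t, L, a, v):
    """Piecewise trajectory in S, with T1 = 0 and T2 = v/a."""
    T2 = v / a
    if t < 0.0:
        return L
    if t < T2:
        return L + 0.5 * a * t * t
    return L + v * t - v * v / (2.0 * a)

def event_in_S_prime(t, L, a, v):
    """Lorentz-transform the event (x(t), t) on the trajectory to S'."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    x = x_of_t(t, L, a, v)
    return g * (x - v * t), g * (t - v * x / C ** 2)

L, a, v = 1.0, 0.5, 0.6
g = 1.0 / math.sqrt(1.0 - v * v)
_, T1_prime = event_in_S_prime(0.0, L, a, v)
_, T2_prime = event_in_S_prime(v / a, L, a, v)
```

For these values `T1_prime` is negative, and the difference `T2_prime - T1_prime` equals $\tfrac{1}{2}(\gamma_v + \gamma_v^{-1})\,v/a$, which is larger than $v/a$.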

Before expressing $x'$ in terms of $t'$, we would first like to draw attention to a number
of interesting features in the relation (9.3) between $t'$ and $t$, also shown in figure 8. As
expected $t'$ depends linearly on $t$ when no acceleration is applied. It is easy to understand
the two slopes, independent of the precise details of the situation, which are equal to $\gamma_v$
and $\gamma_v^{-1}$ for $t < T_1$ and $t > T_2$, respectively. This is precisely in accord with the time
dilation effect: for $t < T_1$ the particle is at rest in S, so that its proper time equals $t$, while
for $t > T_2$ the particle is at rest in S’, so that its proper time equals $t'$ (up to irrelevant
additive constants).

However, for $T_1 < t < T_2$ the time $t'$ does not depend linearly on $t$, due to the acceleration,
and in this time interval neither $t$ nor $t'$ is directly related to the proper time
of the particle. A priori it is not guaranteed that $t'$ rises monotonically as a function of
$t$. If at some point $t'$ would start to decrease, the particle trajectory would start running
backwards in time! This strange phenomenon, which would obviously violate the notion of
causality, would occur when $dt'/dt$ becomes negative. Inspection shows that this is only
the case when $at > c^2/v$. Since $at \leq v$ during the acceleration, this requires that the velocity v obtained after switching off the
acceleration exceeds the velocity of light. This phenomenon was alluded to earlier.
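The sign of the slope is easily inspected numerically. The sketch below assumes that during the acceleration $t' = \gamma_v\,(t - (v/c^2)(L + \tfrac{1}{2}at^2))$, so that $dt'/dt = \gamma_v\,(1 - vat/c^2)$, and samples this slope between switching on and switching off, in units with c = 1 and with illustrative values a = 0.5 and v = 0.6. The function name is hypothetical.

```python
def dt_prime_dt(t, a, v, c=1.0):
    """Slope dt'/dt during the acceleration phase."""
    gamma = 1.0 / (1.0 - (v / c) ** 2) ** 0.5
    return gamma * (1.0 - v * a * t / c ** 2)

a, v = 0.5, 0.6
T2 = v / a
# Sample the slope at eleven equally spaced times between T1 = 0 and T2.
slopes = [dt_prime_dt(T2 * i / 10.0, a, v) for i in range(11)]
```

The slope decreases from $\gamma_v$ at the start to $\gamma_v^{-1}$ at the end and stays positive, as it must for any final velocity below c.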

Figure 8: The time t’ measured along the particle trajectory as a function of the time t
measured along the same trajectory. Again we took positive values for L, v and a.


The reader may get somewhat worried when comparing the two graphs in figure 7. For
an observer at the origin in S’ the velocity changes at negative time. Could this observer
not contact his colleague who is at rest in the origin of S and inform him ahead of time
of the acceleration that is about to be applied to the particle? If this were the case, the 
special theory of relativity would not respect causality! However, the theory does respect 
causality and the reader may easily verify that the information about the deceleration of 
the particle will never reach the first observer for negative times t’, at least not when the 
information travels at a speed of at most the velocity of light. 

One may also wish to determine the proper time of the particle along the trajectory.
This is the time measured by a clock that is attached to the particle (i.e., in its instantaneous
rest frame). We already showed that, before switching on or after switching off
the acceleration, proper-time intervals coincide with time intervals measured in S and S’,
respectively. To obtain the lapse of proper time for the time interval during which the
acceleration is applied, we must realize that for given t the velocity equals at. Therefore,
an infinitesimal increase of the proper time is reduced by the gamma factor associated
with this velocity. More precisely, when the time increases by an amount $\Delta t$, the proper
time is increased only by


\[ \Delta\tau = \sqrt{1 - \frac{a^2t^2}{c^2}}\;\Delta t. \tag{9.5} \]


It thus follows that a proper-time interval will always be shorter than the corresponding
time interval measured in any other inertial frame. As discussed in exercise 9.1, to obtain
a finite proper-time interval, we must integrate (9.5) along the spacetime trajectory.


Exercise 9.1: Proper time. To obtain the proper time elapsed after the acceleration was switched
on, we must integrate (9.5) from $t = T_1 = 0$ to $t < T_2$. The result is more general than for the
example discussed in this section and the proper time can always be determined by integration
along a trajectory in spacetime (cf. (15.5)). In this particular case we can derive the following
result,


\[
\tau - \tau_1 = \int_0^t d\tilde{t}\,\sqrt{1 - \frac{a^2\tilde{t}^2}{c^2}}
= \frac{c}{2a}\,\arcsin\frac{at}{c} + \frac{1}{2}\,t\,\sqrt{1 - \frac{a^2t^2}{c^2}}. \tag{9.6}
\]


Observe that this equation does not depend on L, unlike some of the previous expressions for
time variables. This is entirely reasonable, because for two observers being equally accelerated
at a different position, the proper time should evolve identically. Obviously the above equation
makes sense only when $at < c$, implying once again that the final velocity must remain smaller
than c. When the acceleration is switched off at $t = T_2 = v/a$ the proper time interval between
switching on and off the acceleration is thus equal to


\[ \tau_2 - \tau_1 = \frac{c}{2a}\,\arcsin\frac{v}{c} + \frac{v}{2a\,\gamma_v}. \tag{9.7} \]
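The integral of exercise 9.1 can be checked numerically. The sketch below integrates $\sqrt{1 - a^2t^2/c^2}$ with a simple midpoint rule and compares the outcome with the closed form $(c/2a)\arcsin(at/c) + \tfrac{1}{2}t\sqrt{1 - a^2t^2/c^2}$. Units with c = 1, the number of steps and the function names are choices made for this illustration only.

```python
import math

def proper_time_numeric(t, a, c=1.0, steps=20000):
    """Midpoint-rule integral of sqrt(1 - a^2 s^2 / c^2) ds from 0 to t."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += math.sqrt(1.0 - (a * s / c) ** 2) * h
    return total

def proper_time_closed(t, a, c=1.0):
    """Closed-form proper time elapsed since the acceleration was switched on."""
    x = a * t / c
    return c / (2.0 * a) * math.asin(x) + 0.5 * t * math.sqrt(1.0 - x * x)
```

For a = 0.5 and t = 1.2 (final velocity v = 0.6) both functions give about 1.1235, which is indeed smaller than the coordinate time 1.2.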


Finally let us now present $x'$ as a function of $t'$. While this is trivial for $t < T_1$ and
$t > T_2$, the calculation is somewhat laborious for the interval $T_1 < t < T_2$, because the
relation between $t'$ and $t$ is not linear. One way to evaluate $x'(t')$ is to observe that the
trajectory is part of a parabolic curve in the $(x', t')$-plane. To see this, one may apply the
Lorentz transformation backwards on $x = L + \tfrac{1}{2}at^2$, so that one obtains


\[ \gamma_v\,(x' + vt') = L + \tfrac{1}{2}\,a\,\gamma_v^2\left(t' + \frac{vx'}{c^2}\right)^2. \tag{9.8} \]


This yields two possible solutions for $x'$, of which only one is relevant. The combined result
then takes the following form


\[
x'(t') = \begin{cases}
\gamma_v^{-1} L - vt' & \text{for } t' < T'_1, \\[6pt]
\dfrac{c^4}{\gamma_v a v^2}\left[1 - \sqrt{1 - \dfrac{2av}{\gamma_v c^2}\,t' - \dfrac{2av^2}{c^4}\,L}\,\right] - \dfrac{c^2}{v}\,t' & \text{for } T'_1 < t' < T'_2, \\[10pt]
\gamma_v\left(L - \dfrac{v^2}{2a}\right) & \text{for } t' > T'_2.
\end{cases} \tag{9.9}
\]


As we stressed before, we give these results mainly to demonstrate how the calculation is
performed. The qualitative features, which we discuss below, do not sensitively depend on
all the details of the results above.
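As a cross-check of the laborious middle part of the calculation, one can compare a closed form for $x'(t')$ with the point-by-point Lorentz transformation of the accelerated trajectory $x = L + \tfrac{1}{2}at^2$. The sketch below uses one way of writing the relevant root of the quadratic equation for $x'$; this particular form, the units with c = 1 and the function names are assumptions of the illustration.

```python
import math

def x_prime_closed(tp, L, a, v, c=1.0):
    """Relevant root of the quadratic equation for x', valid during the acceleration."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    root = math.sqrt(1.0 - 2.0 * a * v * tp / (g * c ** 2)
                     - 2.0 * a * v * v * L / c ** 4)
    return c ** 4 / (g * a * v * v) * (1.0 - root) - c ** 2 * tp / v

def parametric(t, L, a, v, c=1.0):
    """Lorentz-transform the event (L + a t^2 / 2, t) to S'."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    x = L + 0.5 * a * t * t
    return g * (x - v * t), g * (t - v * x / c ** 2)
```

For example, with L = 1, a = 0.5 and v = 0.6 the event at t = 1 has $t' = 0.3125$ and $x' = 0.8125$, and the closed form evaluated at $t' = 0.3125$ returns the same $x'$.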

Let us now turn to these qualitative features. At first sight the observation of the 
particle trajectory in the two inertial frames seems in agreement with what one intuitively 
expects, except that there are some relativistic correction factors here and there. However, 
closer inspection reveals that some of these correction factors lead to rather amazing effects. 
To appreciate the significance of these effects let us describe them in a practical situation 
and return to the twins, Caretaker and Adventurer (before Adventurer’s journey described 
in section 5, so that they still have the same age). This time they board an elevator,
but while Caretaker decides to stand on the floor of the elevator, as normal persons do,
Adventurer decides to climb up and sit on the roof of the elevator. Once they have taken 
their positions, the elevator starts moving upwards. (Of course, a suspicious person is free 
to compare once more their age, immediately prior to taking off.) In line with our previous 
discussion we assume that the acceleration is constant during some time interval and is 
subsequently switched off, so that the elevator moves at a constant velocity v. Somewhat 
to their surprise, a little later the doors open at some other floor and the twins are able 
to leave. Of course, this implies that the higher floor moves with constant velocity v with 
respect to the lower one. This should not concern us here. At any rate, the twins are not 
able to verify this directly, as there are no windows on this floor. Thus they conclude that 
the elevator has probably been brought back to rest very gently, as they did not notice any 
effect due to the slowing down. 

The space-time trajectories of the twins described in the two inertial frames S and S’
attached to the lower and the higher floor, respectively, are shown in figure 9, where $x$ and
$x'$ denote the coordinates in the vertical direction. The trajectories are the same as the
trajectories depicted in figure 7; they follow from the previous formulae, except that the
parameter L, which measures the height in S when boarding the elevator, is different for
Caretaker and Adventurer and will be denoted by $L_C$ and $L_A$, respectively. Consequently
the height of the elevator is equal to $h = L_A - L_C$. Thus Caretaker and Adventurer are a
distance h apart after boarding the elevator.

Let us view the situation from the point of view of the two reference frames associated 
with the two floors. In the coordinate frame S attached to the lower floor the situation is 
as expected. Both twins move up with the same velocity and at a fixed relative height equal 
to h. The proper time of the two twins (i.e., their age) is given by the same expression 
(9.6) as derived before. Equal time $t$ thus corresponds to the same proper time $\tau$.

The qualitative features of the situation are drastically different when viewed in S’, the
reference frame of the upper floor. Here one sees the twins approaching this floor with
velocity v. They both slow down and ultimately come to rest at a certain height measured
with respect to the upper floor coordinate frame. However, the twins do not come to rest
at the same time! The velocity changes for Caretaker are delayed. In particular, as follows
from (9.3) and (9.4), the acceleration of Caretaker is switched on and off at a later time
than for Adventurer. The delay time is equal to $\Delta t' = \gamma_v v h/c^2$. However, according to
the observer associated with the lower floor, as well as according to the clocks carried by
the twins (which measure their proper time), the velocity changes happen at the same
time. This implies that, after arrival on the upper floor, Adventurer has become older
than Caretaker and their age difference is equal to $\gamma_v v h/c^2$. Another way of expressing
this is by saying that the time apparently runs slower for Caretaker! Observe that, when
the acceleration is only switched on during a small time interval $\Delta T$ (measured in S), then
the age difference can be approximated by


\[ \Delta t' \approx \frac{ah}{c^2}\,\Delta T. \tag{9.10} \]


Figure 9: The space-time trajectory of the two twins Caretaker and Adventurer, who have
boarded the same elevator but have taken positions at different heights. The coordinates
$x$ and $x'$ refer to the vertical direction.


This formula was first derived by Einstein in 1907.[11]
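To get a feeling for the size of the effect, the sketch below compares the age difference $\gamma_v v h/c^2$ with the approximation $a h\,\Delta T/c^2$ for a short acceleration phase, using $v = a\,\Delta T$. The numbers (an elevator of height 3 m accelerating at 9.81 m/s² for 10 s) are purely hypothetical, as are the function names.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def age_difference_exact(v, h):
    """gamma_v * v * h / c^2 for final velocity v and height difference h."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return g * v * h / C ** 2

def age_difference_approx(a, h, dT):
    """Small-velocity approximation a * h * dT / c^2."""
    return a * h * dT / C ** 2

a, h, dT = 9.81, 3.0, 10.0  # hypothetical values in SI units
exact = age_difference_exact(a * dT, h)
approx = age_difference_approx(a, h, dT)
```

Both expressions give an age difference of a few times $10^{-15}$ s for these values, and they agree to very high relative accuracy because $\gamma_v$ differs from 1 only at the level of $10^{-13}$.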

However, the difference in age is not the only effect of the acceleration. As seen in S’,
the difference in height between Caretaker and Adventurer is not constant, but increases
in time. Once both twins have come to rest, their relative distance has increased to
$\gamma_v h$. Already before we noticed that distances depend on the frame in which they are
measured, as is illustrated by the Lorentz-FitzGerald contraction discussed in section 6.
Without specifying the velocity of the observer who performs the measurement, the result
of a distance measurement is of no value. Therefore it is best to always refer to length
measurements done in the rest frame. However, when the twins are being accelerated,
it is not possible at any time to establish an inertial frame in which they are at rest
simultaneously. For the twins, we can compare their relative distance in the rest frame
some time before and some time after the force is applied. The result of this measurement
is unambiguous and indicates that their relative distance has increased. The phenomenon
is therefore a true physical effect, not something that depends on the properties of the
observer who performs the measurements. We should also stress that the time lag and the
change of the relative distance are not related to the details of the journey, as we will show
in exercises 9.2 and 9.3.


[11] A. Einstein, Jahrb. Rad. Elektr. 4 (1907) 411.


Exercise 9.2: Relative distance and acceleration — To show that the change of the relative 
distance is not related to the details of the journey, we compare two inertial frames, S and S’, 
with respect to which the twins are at rest before and after the forces are applied. According to 
an observer at rest in S, the relative distance between the twins is constant and equal to h. This 
implies that in S, the acceleration of the twins is always the same at every instant in time. Before 
the twins are accelerated, an observer at rest in S’ measures a shorter distance between the twins 
than an observer at rest in S, because of the Lorentz contraction. At the end, when no forces 
are applied anymore and the twins are at rest in the frame S’, the relative distance as measured 
by the observer in S must be the Lorentz contracted result of the distance measured in the rest 
frame S’. Therefore it follows that the actual distance measured in the new rest frame must have 
increased after the journey by a factor $\gamma_v$. Argue that the change in the relative distance is thus
independent of the details of the journey and of the time dependence of the applied forces. 


Exercise 9.3: Time lag and acceleration — The time lag does not depend on the details of the 
journey either. To see this, prove that events which take place at the same time $t$ but at different
positions at a fixed relative distance h in the frame S are seen by an observer at rest in the
frame S’ with a time separation $\Delta t' = \beta\gamma_v h/c$. Consequently, the twins moving at fixed relative
distance in S always give rise to the same time difference At’ in S’, irrespective of the details of 
the trajectory in spacetime. 


One may again try to appeal to some sort of symmetry between the two twins in order 
to argue that the above effects cannot possibly arise. However, the symmetry between the 
twins is affected by the fact that the force responsible for the acceleration has a certain 
direction; in this case it is directed upwards, from Caretaker to Adventurer. Therefore 
the initial symmetry is lost. There is another experiment, which hinges directly on the 
underlying postulates of special relativity, that shows that the symmetry between the twins 
is lost. Suppose that Caretaker sends a light beam towards Adventurer with frequency $\nu$
and let us assume that the relative velocity of the twins is zero (as is the case in the inertial
frame S associated with the lower floor). It takes the light approximately h/c seconds to
reach Adventurer, so that by that time Adventurer’s velocity will have increased by


\[ \Delta v \approx \frac{ah}{c} \tag{9.11} \]


as a result of the acceleration. Consequently, Adventurer will observe a lower frequency as
a result of the Doppler shift,


\[ \nu_A \approx \nu\left(1 - \frac{ah}{c^2}\right). \tag{9.12} \]


Conversely, when Adventurer sends light beams to Caretaker, the frequency observed by
the latter will increase by the same amount. Hence


\[ \nu_C \approx \nu\left(1 + \frac{ah}{c^2}\right). \tag{9.13} \]


Thus this experiment clearly demonstrates the lack of symmetry between the twins during
the time the acceleration is experienced.
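The fractional frequency shift in this experiment is $\Delta v/c \approx ah/c^2$, which is extremely small for everyday accelerations. The sketch below evaluates it for hypothetical values (an acceleration of 9.81 m/s² and a height difference of 22.5 m); the function name and the numbers are illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in m/s

def fractional_shift(a, h):
    """Fractional Doppler shift a*h/c^2 between the two twins."""
    delta_v = a * h / C      # velocity acquired during the light travel time h/c
    return delta_v / C

shift = fractional_shift(9.81, 22.5)
```

For these values the shift is about $2.5 \times 10^{-15}$, a decrease for the light received by Adventurer and an increase of the same size for the light received by Caretaker.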


10 Transformation of velocities


As explained in the previous section, a particle sweeps out a certain trajectory in a space-time
diagram, defined by its position vector $\vec{r}(t)$, measured in some inertial frame S as a
function of the corresponding time t. Its velocity vector is then equal to

\[ \vec{u}(t) = \frac{d\vec{r}(t)}{dt}. \tag{10.1} \]

In some other inertial frame S’, the particle sweeps out a trajectory defined in terms of
$\vec{r}\,'(t')$, the position of the particle measured in the new frame as a function of the time $t'$
relevant to that frame. In the previous section we demonstrated how to obtain $\vec{r}\,'(t')$. In
this section we want to determine in general how the velocity of the particle changes from
one inertial frame to another. To do so, let us consider two neighbouring points (events) on
the trajectory separated by a small time increment $\Delta t$ with corresponding position vectors
$\vec{r}(t)$ and $\vec{r}(t + \Delta t)$. In S’ these two events correspond to $\vec{r}\,'(t')$ and $\vec{r}\,'(t' + \Delta t')$ with $\Delta t'$
the corresponding time increment in S’. The coordinates and the times in the two frames
are related by the Lorentz transformation (8.4), where $\vec{v}$ is the relative velocity of the two
frames, whose clocks have been synchronized at $t = t' = 0$ in the usual way.
Hence, for the first event we have

\[
\vec{r}\,'_{\parallel}(t') = \gamma_v\left(\vec{r}_{\parallel}(t) - \vec{v}\,t\right), \qquad
\vec{r}\,'_{\perp}(t') = \vec{r}_{\perp}(t), \qquad
t' = \gamma_v\left(t - \frac{\vec{v}\cdot\vec{r}(t)}{c^2}\right), \tag{10.2}
\]
and, for the second event,
\[
\vec{r}\,'_{\parallel}(t' + \Delta t') = \gamma_v\left(\vec{r}_{\parallel}(t + \Delta t) - \vec{v}\,(t + \Delta t)\right), \qquad
\vec{r}\,'_{\perp}(t' + \Delta t') = \vec{r}_{\perp}(t + \Delta t),
\]
\[
t' + \Delta t' = \gamma_v\left(t + \Delta t - \frac{\vec{v}\cdot\vec{r}(t + \Delta t)}{c^2}\right). \tag{10.3}
\]


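When $\Delta t$ is sufficiently small we can write $\vec{r}(t + \Delta t) \approx \vec{r}(t) + \Delta t\,\vec{u}(t)$. Therefore it follows
from (10.2) and (10.3) that


\[ \Delta t' \approx \gamma_v\left(1 - \frac{\vec{u}(t)\cdot\vec{v}}{c^2}\right)\Delta t, \tag{10.4} \]


so that $\Delta t'$ is proportional to $\Delta t$, as expected. Likewise we obtain from (10.2) and (10.3),
\[
\vec{r}\,'_{\parallel}(t' + \Delta t') - \vec{r}\,'_{\parallel}(t') \approx \gamma_v\left(\vec{u}_{\parallel}(t) - \vec{v}\right)\Delta t, \qquad
\vec{r}\,'_{\perp}(t' + \Delta t') - \vec{r}\,'_{\perp}(t') \approx \vec{u}_{\perp}(t)\,\Delta t. \tag{10.5}
\]


Using the definition of the velocity vector in the S’ frame (cf. (10.1)),


\[ \vec{u}\,'(t') = \frac{d\vec{r}\,'(t')}{dt'}, \tag{10.6} \]


we can approximate the left-hand side of (10.5) by $\Delta t'\,\vec{u}\,'(t')$. After dividing by $\Delta t'$ and
using (10.4) we thus determine $\vec{u}\,'(t')$.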

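Before giving the result we note that we could have derived the same result directly by
differentiating (10.2). This leads to


\[
\frac{d\vec{r}\,'_{\parallel}(t')}{dt} = \gamma_v\left(\vec{u}_{\parallel}(t) - \vec{v}\right), \qquad
\frac{d\vec{r}\,'_{\perp}(t')}{dt} = \vec{u}_{\perp}(t), \tag{10.7}
\]
and
\[ \frac{dt'}{dt} = \gamma_v\left(1 - \frac{\vec{u}(t)\cdot\vec{v}}{c^2}\right). \tag{10.8} \]


On the other hand we may use the chain rule and write


\[
\frac{d\vec{r}\,'(t')}{dt} = \frac{dt'}{dt}\,\frac{d\vec{r}\,'(t')}{dt'}
= \gamma_v\left(1 - \frac{\vec{u}(t)\cdot\vec{v}}{c^2}\right)\vec{u}\,'(t'), \tag{10.9}
\]


where we made use of (10.8) and (10.6).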
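Combining the above results, we find the following transformation rule for the velocities


\[
\vec{u}\,'_{\parallel} = \frac{\vec{u}_{\parallel} - \vec{v}}{1 - \vec{u}\cdot\vec{v}/c^2}, \qquad
\vec{u}\,'_{\perp} = \frac{\vec{u}_{\perp}}{\gamma_v\left(1 - \vec{u}\cdot\vec{v}/c^2\right)}. \tag{10.10}
\]


As an example let us first apply (10.10) to the case where $\vec{u}$ and $\vec{v}$ are parallel. Then
$\vec{u}_{\perp} = 0$ and the first equation of (10.10) coincides precisely with our previous result for the
addition of parallel velocities. Observe that when $\vec{u}$ and $\vec{u}\,'$ are not constant, they refer to
the velocities taken at corresponding points of the space-time trajectory.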

By means of this result we can determine how tw? changes under Lorentz transforma- 
tions. Rather than w? we are interested in the change of the corresponding gamma factor, 


(10.10) 


np (10.11) 


40 


which is a result that we will need shortly. From (10.10) it follows that 


$$\gamma_{u'} = \gamma_u\,\gamma_v\left(1 - \frac{\vec u\cdot\vec v}{c^2}\right). \tag{10.12}$$


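The proof of this goes as follows. For convenience we assume that $\vec v$ is directed along the
positive x-axis. Making use of (10.10) we first calculate $u'^2$,


$$u'^2 = \left(1 - \frac{u_x v}{c^2}\right)^{-2}\left[(u_x - v)^2 + \left(1 - \frac{v^2}{c^2}\right)\left(u_y^2 + u_z^2\right)\right]. \tag{10.13}$$

Subsequently we substitute

$$u_y^2 + u_z^2 = u^2 - u_x^2, \quad\text{and}\quad u_x = \frac{\vec u\cdot\vec v}{v}, \tag{10.14}$$


and write (10.13) in terms of the lengths of $\vec u$ and $\vec v$ and their inner product. Then we verify
that the following equation holds


$$1 - \frac{u'^2}{c^2} = \left(1 - \frac{\vec u\cdot\vec v}{c^2}\right)^{-2}\left(1 - \frac{u^2}{c^2}\right)\left(1 - \frac{v^2}{c^2}\right). \tag{10.15}$$


From this result we obtain (10.12). In order to check the validity of (10.12) we substitute
$\vec u = \vec v$. This leads to the expected result $\gamma_{u'} = 1$.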
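
The transformation rule (10.10) and the identity (10.12) can also be checked numerically. The following is a minimal sketch, assuming units with c = 1 and a frame velocity along the x-axis; the function names `gamma` and `transform_velocity` and the sample velocities are chosen here purely for illustration.

```python
import math

C = 1.0  # assumption of this sketch: units in which the speed of light is 1


def gamma(u):
    """Gamma factor (10.11) of a velocity given as a tuple (ux, uy, uz)."""
    u2 = sum(component ** 2 for component in u)
    return 1.0 / math.sqrt(1.0 - u2 / C ** 2)


def transform_velocity(u, v):
    """Velocity seen in S' for a particle with velocity u in S, following (10.10).

    S' moves with speed v along the positive x-axis of S.
    """
    ux, uy, uz = u
    denominator = 1.0 - ux * v / C ** 2       # 1 - u.v/c^2
    gamma_v = gamma((v, 0.0, 0.0))
    return ((ux - v) / denominator,               # parallel component
            uy / (gamma_v * denominator),         # perpendicular components
            uz / (gamma_v * denominator))


u = (0.5, 0.3, -0.2)   # illustrative values, not taken from the text
v = 0.6
u_prime = transform_velocity(u, v)

left = gamma(u_prime)                                            # left-hand side of (10.12)
right = gamma(u) * gamma((v, 0.0, 0.0)) * (1.0 - u[0] * v / C ** 2)
print(u_prime)
print(left, right)   # the two numbers should agree up to rounding
```

Calling `transform_velocity((0.6, 0.0, 0.0), 0.6)` returns a vanishing velocity, in line with the check $\vec u = \vec v$ made above.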


11 Energy and momentum 


Previously we found that the addition of two velocities smaller than c always gives a 
combined velocity that is still smaller than c. This result means that a particle moving 
at some velocity v < c will move at a velocity smaller than c in any inertial frame which 
itself moves at a relative velocity smaller than c. Therefore it appears that one is led to 
the conclusion that the velocity of a material body can never be equal to or larger than the 
velocity of light. On the other hand, by applying a constant force on a particle, its velocity 
is expected to increase indefinitely, and it is not clear a priori why it cannot acquire just any 
velocity by applying the force for a sufficiently long period of time. Similarly, one expects 
that the momentum and the energy of a particle will increase indefinitely by applying some 
constant force on it. 

In order to understand these issues we must reexamine how the laws of physics that 
apply to low velocities are modified at velocities comparable to the speed of light. As it 
turns out, the value of the momentum and the energy of a particle can increase indefinitely 
but the relation between the energy, the momentum and the velocity of a particle is such 
that the velocity is restricted to take values smaller than the velocity of light. In this section 
we derive the relativistic expressions for the momentum and the energy of a (pointlike) 
particle with mass m and velocity $\vec u$. We will do this by considering an elastic collision
between two such particles in two different inertial frames. 




From standard Newtonian mechanics we know that the motion of these particles is 
characterized entirely by their momentum and energy, which are expressed in terms of the 
velocity vector $\vec u$ and the mass m by


$$\vec p = m\vec u, \qquad E = \tfrac{1}{2}m\vec u^{\,2} = \frac{\vec p^{\,2}}{2m}. \tag{11.1}$$


As indicated above these expressions are no longer applicable for relativistic velocities. 
In order to find the correct definitions we parametrize the momentum and energy as follows, 


$$\vec p = m\,f(\gamma_u)\,\vec u, \qquad E = mc^2\,g(\gamma_u), \tag{11.2}$$


where we made use of dimensional arguments, while f and g are two, as yet unknown,
dimensionless functions of


$$\gamma_u = \frac{1}{\sqrt{1 - u^2/c^2}}. \tag{11.3}$$


The functions f and g therefore depend only on the magnitude of the velocity, and not
on its direction. Consequently, the momentum and the velocity are proportional to each
other (i.e. $\vec p \parallel \vec u$) with a proportionality factor that depends only on the magnitude of
the velocity. Also the energy depends on the magnitude, but not on the direction of the
velocity. Hence the definitions of momentum and energy do not depend on the spatial
orientation.

We now require that, in every inertial frame, Newton’s second law is valid, 


$$\vec F = \frac{d\vec p}{dt}, \tag{11.4}$$


where F is the force applied to the particle in a given inertial frame. From this it follows 
that (in every inertial frame) the total momentum does not change in a collision of two 
particles, i.e., 

$$\vec p_1 + \vec p_2 = \vec p_3 + \vec p_4, \tag{11.5}$$


where $\vec p_1$ and $\vec p_2$ denote the momenta of the incoming, and $\vec p_3$ and $\vec p_4$ the momenta of the
outgoing particles. Furthermore we assume that the collision is elastic, so that we have
conservation of energy,

$$E_1 + E_2 = E_3 + E_4. \tag{11.6}$$


Our aim is to determine the functions f and g from the requirement that the collision
process is described correctly in every inertial frame by the conservation laws of energy and
momentum. To that end we consider two colliding equal-mass particles in the center-of-mass
frame S, so that their velocities satisfy the relation $\vec u_1 = -\vec u_2 = \vec u$. After the collision
the direction of motion of the particles is rotated over an angle of 90°, with velocities
$\vec w_1 = -\vec w_2 = \vec w$. Owing to energy conservation the velocities $\vec u$ and $\vec w$ must be of equal
magnitude: u = w.




Figure 10: Elastic collision of two equal-mass particles viewed in two different inertial
frames. In the figure the velocity vector $\vec v$ is directed along the vector $\vec w_2$.


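Consider now the same collision as seen by an observer at rest in an inertial frame
S' that moves with a velocity $\vec v$ with respect to S. We choose $\vec v$ parallel to $\vec w_1$ and $\vec w_2$ (see
figure 10). The velocities of the incoming and outgoing particles in S' are then given by
(we make use of (10.10))


$$\vec u\,'_1 = \vec u\,'_{1\parallel} + \vec u\,'_{1\perp} = \frac{\vec u}{\gamma_v} - \vec v, \qquad \vec u\,'_2 = \vec u\,'_{2\parallel} + \vec u\,'_{2\perp} = -\frac{\vec u}{\gamma_v} - \vec v, \tag{11.7}$$

$$\vec w\,'_1 = \vec w\,'_{1\parallel} + \vec w\,'_{1\perp} = \frac{\vec w - \vec v}{1 - \vec w\cdot\vec v/c^2}, \qquad \vec w\,'_2 = \vec w\,'_{2\parallel} + \vec w\,'_{2\perp} = \frac{-\vec w - \vec v}{1 + \vec w\cdot\vec v/c^2}. \tag{11.8}$$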


Let us first verify that the classical definition (11.1) for the momentum no longer leads to 
momentum conservation in S’. In that case the total momentum of the incoming particles 
is equal to 

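$$\vec p\,'_1 + \vec p\,'_2 = m\vec u\,'_1 + m\vec u\,'_2 = -2m\vec v, \tag{11.9}$$
whereas the total momentum of the outgoing particles is given by


$$\vec p\,'_3 + \vec p\,'_4 = m\vec w\,'_1 + m\vec w\,'_2 = \frac{m\vec w - m\vec v}{1 - \vec w\cdot\vec v/c^2} + \frac{-m\vec w - m\vec v}{1 + \vec w\cdot\vec v/c^2}. \tag{11.10}$$


It is convenient to use the parametrization $\vec w = \eta\,\vec v$, with $\eta$ some constant (taken negative
in figure 10). Then the previous result can be written as


$$\vec p\,'_3 + \vec p\,'_4 = m\vec v\left[\frac{\eta - 1}{1 - \eta v^2/c^2} - \frac{\eta + 1}{1 + \eta v^2/c^2}\right]. \tag{11.11}$$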


Now one can easily verify that $\vec p\,'_1 + \vec p\,'_2$ and $\vec p\,'_3 + \vec p\,'_4$ are only equal if $v^2 = c^2$ or if $\vec w$ or $\vec v$
vanish. Hence we can have no momentum conservation except in extreme cases.
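
To make the mismatch concrete, one may evaluate (11.9) and (11.11) for some sample values. The sketch below assumes units with c = 1 and works with the components along $\vec v$; the function name and the chosen numbers are illustrative only.

```python
def classical_momenta_in_s_prime(eta, beta, m=1.0, c=1.0):
    """Classical total momenta in S', as components along the frame velocity v.

    eta  : ratio defined by w = eta * v
    beta : v / c
    Returns (incoming, outgoing) according to (11.9) and (11.11).
    """
    v = beta * c
    incoming = -2.0 * m * v
    outgoing = m * v * ((eta - 1.0) / (1.0 - eta * beta ** 2)
                        - (eta + 1.0) / (1.0 + eta * beta ** 2))
    return incoming, outgoing


# Illustrative values: w = -0.5 v and v = 0.6 c.
incoming, outgoing = classical_momenta_in_s_prime(eta=-0.5, beta=0.6)
print(incoming, outgoing)   # the two totals differ

# For eta = 0 (vanishing w) the two expressions coincide.
print(classical_momenta_in_s_prime(eta=0.0, beta=0.6))
```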

Let us once more follow the same reasoning, but now assuming that the momentum 
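vector is defined by (11.2). First we recall that $\gamma_u$ changes under a Lorentz transformation
associated with an arbitrary velocity $\vec v$ according to


$$\gamma_{u'} = \gamma_u\,\gamma_v\left(1 - \frac{\vec u\cdot\vec v}{c^2}\right). \tag{11.12}$$


This result is given in (10.12). The total momentum of the incoming particles is now equal
to


$$\begin{aligned}
\vec p\,'_1 + \vec p\,'_2 &= m f(\gamma_{u'_1})\,\vec u\,'_1 + m f(\gamma_{u'_2})\,\vec u\,'_2 \\
&= m f(\gamma_u\gamma_v)\left(\frac{\vec u}{\gamma_v} - \vec v\right) + m f(\gamma_u\gamma_v)\left(-\frac{\vec u}{\gamma_v} - \vec v\right) \\
&= -2m f(\gamma_u\gamma_v)\,\vec v,
\end{aligned} \tag{11.13}$$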


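whereas the total momentum of the outgoing particles is given by


$$\begin{aligned}
\vec p\,'_3 + \vec p\,'_4 &= m f(\gamma_{w'_1})\,\vec w\,'_1 + m f(\gamma_{w'_2})\,\vec w\,'_2 \\
&= m f\!\left(\gamma_u\gamma_v\left(1 - \frac{\vec w\cdot\vec v}{c^2}\right)\right)\frac{\vec w - \vec v}{1 - \vec w\cdot\vec v/c^2}
 + m f\!\left(\gamma_u\gamma_v\left(1 + \frac{\vec w\cdot\vec v}{c^2}\right)\right)\frac{-\vec w - \vec v}{1 + \vec w\cdot\vec v/c^2},
\end{aligned} \tag{11.14}$$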


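where we made use of $\gamma_u = \gamma_w$. Momentum conservation in S' implies that (11.13) and
(11.14) should be equal, irrespective of the magnitude of the vector $\vec v$. This means that
the unknown function f must satisfy the relation


$$-2 f(\gamma_u\gamma_v)\,\vec v = f\!\left(\gamma_u\gamma_v\left(1 - \frac{\vec w\cdot\vec v}{c^2}\right)\right)\frac{\vec w - \vec v}{1 - \vec w\cdot\vec v/c^2}
 + f\!\left(\gamma_u\gamma_v\left(1 + \frac{\vec w\cdot\vec v}{c^2}\right)\right)\frac{-\vec w - \vec v}{1 + \vec w\cdot\vec v/c^2}, \tag{11.15}$$


for any velocity $\vec v$ parallel to $\vec w$.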

An obvious solution of (11.15) is given by $f(\gamma) = C\gamma$ with C some arbitrary constant.
In fact this turns out to be the most general solution as well¹². In order to obtain agreement
for small velocities with the classical definition of the momentum (cf. (11.1)) we choose
C = 1, so that


$$\vec p = m\gamma_u\,\vec u = \frac{m\vec u}{\sqrt{1 - u^2/c^2}} \approx m\vec u\left\{1 + O((u/c)^2)\right\}. \tag{11.16}$$


¹² In order to prove this, show that $f(\gamma)$ satisfies the differential equation $f' = \gamma^{-1}f$ by differentiating
(11.15) with respect to v and putting v = 0 afterwards. The general solution of this first-order differential
equation is $f(\gamma) = C\gamma$.
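
With $\vec p = m\gamma_u\vec u$ the collision of figure 10 does conserve momentum in S'. A small numerical sketch of this statement follows; it assumes c = 1, works in the plane of the collision, and uses hypothetical helper names and sample speeds.

```python
import math


def boost_velocity(u, v):
    """Apply (10.10) in two dimensions (c = 1) for a frame velocity v along x."""
    ux, uy = u
    denominator = 1.0 - ux * v
    return ((ux - v) / denominator, uy * math.sqrt(1.0 - v * v) / denominator)


def momentum(u, m=1.0):
    """Relativistic momentum (11.16) with c = 1."""
    g = 1.0 / math.sqrt(1.0 - u[0] ** 2 - u[1] ** 2)
    return (m * g * u[0], m * g * u[1])


def total_momentum(velocities, v):
    """Sum of the momenta in S' of particles with the given velocities in S."""
    momenta = [momentum(boost_velocity(u, v)) for u in velocities]
    return (sum(p[0] for p in momenta), sum(p[1] for p in momenta))


speed, v = 0.7, 0.4                        # illustrative values
incoming = [(0.0, speed), (0.0, -speed)]   # u1 = -u2, perpendicular to v
outgoing = [(-speed, 0.0), (speed, 0.0)]   # w1 = -w2, parallel to v

p_in = total_momentum(incoming, v)
p_out = total_momentum(outgoing, v)
print(p_in, p_out)   # both totals should agree up to rounding
```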


The derivation of the relativistic expression for the energy proceeds in a similar way. 
According to (11.2), the energy of the two incoming particles is equal to 


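$$\begin{aligned}
E'_1 + E'_2 &= mc^2 g(\gamma_{u'_1}) + mc^2 g(\gamma_{u'_2}) \\
&= 2mc^2 g(\gamma_u\gamma_v),
\end{aligned} \tag{11.17}$$


while the total energy of the outgoing particles is given by


$$\begin{aligned}
E'_3 + E'_4 &= mc^2 g(\gamma_{w'_1}) + mc^2 g(\gamma_{w'_2}) \\
&= mc^2 g\!\left(\gamma_u\gamma_v\left(1 - \frac{\vec w\cdot\vec v}{c^2}\right)\right) + mc^2 g\!\left(\gamma_u\gamma_v\left(1 + \frac{\vec w\cdot\vec v}{c^2}\right)\right).
\end{aligned} \tag{11.18}$$
Energy conservation in S' implies therefore that
$$2g(\gamma_u\gamma_v) = g\!\left(\gamma_u\gamma_v\left(1 - \frac{\vec w\cdot\vec v}{c^2}\right)\right) + g\!\left(\gamma_u\gamma_v\left(1 + \frac{\vec w\cdot\vec v}{c^2}\right)\right). \tag{11.19}$$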


The general solution of this equation is $g(\gamma) = C_1\gamma + C_2$, with $C_1$ and $C_2$ two arbitrary
constants¹³. We choose the constant $C_1 = 1$ in order to obtain agreement with the classical
expression for the kinetic energy (cf. (11.1)). To verify this, note that the energy acquires
the following form,


$$E = mc^2\gamma_u + \text{constant} = \frac{mc^2}{\sqrt{1 - u^2/c^2}} + \text{constant}. \tag{11.20}$$


For small velocities, E becomes equal to
$$E = \text{constant} + \tfrac{1}{2}m\vec u^{\,2}\left\{1 + O((u/c)^2)\right\}, \tag{11.21}$$


which does indeed yield the classical expression for the kinetic energy, $\tfrac{1}{2}m\vec u^{\,2}$, up to an
additive constant. We shall choose the constant $C_2$ equal to 0, so that


$$E = mc^2\gamma_u = \frac{mc^2}{\sqrt{1 - u^2/c^2}} \approx mc^2 + \tfrac{1}{2}m\vec u^{\,2}\left\{1 + O((u/c)^2)\right\}. \tag{11.22}$$


In this way we find a rest energy $E_0$ of a particle equal to


$$E_0 = mc^2. \tag{11.23}$$


¹³ From (11.19) one derives that g satisfies the differential equation $g'' = 0$ by differentiating twice with
respect to v and putting v = 0 afterwards. This second-order differential equation has the general solution
$g(\gamma) = C_1\gamma + C_2$.


This choice for the rest energy does not follow from the study of elastic collisions, but it 
can be justified from decay processes where for instance an unstable particle with mass 
M decays into two particles of smaller mass m. It also follows from the general argument 
given below. 
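
The low-velocity behaviour of the energy can be illustrated numerically by comparing $E - E_0$ with the classical kinetic energy. This is only a sketch, assuming units with c = 1 by default; the function names are hypothetical.

```python
import math


def relativistic_energy(m, u, c=1.0):
    """Energy m c^2 / sqrt(1 - u^2/c^2) of a particle with speed u."""
    return m * c ** 2 / math.sqrt(1.0 - (u / c) ** 2)


def kinetic_energy_ratio(u, c=1.0):
    """Ratio (E - E_0) / (m u^2 / 2); the mass drops out of this ratio."""
    m = 1.0
    rest_energy = relativistic_energy(m, 0.0, c)
    return (relativistic_energy(m, u, c) - rest_energy) / (0.5 * m * u ** 2)


# The ratio approaches 1 for small u/c and grows as u approaches c.
for beta in (0.01, 0.1, 0.5, 0.9):
    print(beta, kinetic_energy_ratio(beta))
```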

Under a Lorentz transformation the momentum and energy as found in (11.16) and
(11.22) change in a way that is similar to the transformation of space and time. This
can directly be verified by using (10.10) and (10.12). For a Lorentz transformation with
velocity $\vec v$ we find


$$\vec p\,'_\parallel = \gamma_v\left(\vec p_\parallel - \frac{\vec v\,E}{c^2}\right), \qquad \vec p\,'_\perp = \vec p_\perp, \qquad E' = \gamma_v\left(E - \vec v\cdot\vec p\right), \tag{11.24}$$


where $\vec p_\parallel$ and $\vec p_\perp$ are the components of the momentum vector parallel and perpendicular to
$\vec v$. When comparing the transformation (11.24) with the Lorentz transformation as derived
previously, we observe that $\vec p$ and E/c transform in the same way as $\vec r$ and ct. Because
the transformation (11.24) is linear we know that a linear combination of momenta and
corresponding energies that is equal to zero in one inertial frame (such as, for instance,
in (11.5) and (11.6)), will be equal to zero in any other inertial frame. In other words,
the expressions for the momentum and energy that we have found by considering one
particular collision process in two different frames, will also give the correct description for
more general collision processes.
Finally we observe that E and $\vec p$ satisfy the Lorentz invariant condition


$$\vec p^{\,2} - \frac{E^2}{c^2} = -m^2c^2, \tag{11.25}$$


as follows from the definitions (11.16) and (11.22). Because the energy is positive we find
the expression

$$E = c\sqrt{\vec p^{\,2} + m^2c^2}. \tag{11.26}$$


An alternative expression which is often convenient, is
$$E = \sqrt{\vec p^{\,2}c^2 + E_0^2}, \tag{11.27}$$


where $E_0$ is the rest energy defined in (11.23).
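
The invariance of $\vec p^{\,2} - E^2/c^2$ under the transformation of energy and momentum can be checked numerically. The sketch below assumes c = 1 by default and a frame velocity along the x-axis; the function names and sample values are illustrative assumptions.

```python
import math


def boost_energy_momentum(E, p, v, c=1.0):
    """Transform (E, p) to a frame moving with speed v along the x-axis.

    p is a tuple (px, py, pz); px is the component parallel to v.
    Returns the pair (E', p').
    """
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    px, py, pz = p
    return g * (E - v * px), (g * (px - v * E / c ** 2), py, pz)


def invariant(E, p, c=1.0):
    """The combination p^2 - E^2/c^2, which should equal -m^2 c^2."""
    return sum(component ** 2 for component in p) - (E / c) ** 2


m, u = 2.0, (0.3, 0.4, 0.0)                      # illustrative mass and velocity
g_u = 1.0 / math.sqrt(1.0 - sum(x * x for x in u))
E = m * g_u                                      # energy for c = 1
p = tuple(m * g_u * x for x in u)                # momentum for c = 1

E_boosted, p_boosted = boost_energy_momentum(E, p, 0.8)
print(invariant(E, p), invariant(E_boosted, p_boosted))   # both close to -m^2
```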

We can also verify the correctness of the relativistic expression for energy in another
way. In a given inertial frame the work done by a force on a particle per unit time is equal
to

$$\frac{dE}{dt} = \vec F\cdot\vec u, \tag{11.28}$$
where $\vec u$ is the instantaneous velocity of the particle. Using Newton's law (11.4) then gives
rise to


$$\frac{dE}{dt} = \vec u\cdot\frac{d\vec p}{dt} = \frac{d}{dt}\left(\frac{mc^2}{\sqrt{1 - u^2/c^2}}\right). \tag{11.29}$$


This equation is in agreement with the result found in (11.22).


12 Particles with zero mass 


A particle with energy E and momentum $\vec p$ has a velocity equal to

$$\vec u = \frac{\vec p\,c^2}{E}. \tag{12.1}$$

Consequently, for particles that move with the velocity of light, one has


$$E = pc. \tag{12.2}$$


By means of (11.25) we then derive that such particles have no mass. This is quite a
new phenomenon, which we did not consider until now, because we always assumed that
material objects have a velocity smaller than that of light. Some of the formulae that
we have derived so far are no longer applicable, such as, for instance, (11.16) and (11.22),
because for massless particles we have m = 0 and $\gamma = \infty$. Because these particles propagate
with the speed of light, which does not change under a Lorentz transformation, there is no
observer for which they are at rest.

The above arguments indicate that a particle interpretation of light can be consistent
with the theory of relativity. Indeed we speak of "photons", the particles associated with
light. Other massless particles in Nature are the so-called "gravitons", the particles
associated with the gravitational field, and "neutrinos", particles that appear in radioactive
"beta" decay.

Let us examine how the energy of a massless particle changes under a Lorentz
transformation. With the help of (11.24) we derive


$$E' = \gamma_v\left(E - \vec v\cdot\vec p\right) = \gamma_v\,(1 - \beta)\,E = \sqrt{\frac{1 - \beta}{1 + \beta}}\;E, \tag{12.3}$$


where we assumed that $\vec v$ and $\vec p$ are parallel and we used (12.2). The energy thus transforms
in the same way as the frequency of a light wave according to the relativistic (longitudinal)
Doppler effect (cf. (4.2)). This is consistent with Planck's equation, according to which the
energy and the frequency $\nu$ of a photon are proportional,


$$E = h\nu. \tag{12.4}$$


The proportionality constant h is called Planck’s constant. 
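
As a small numerical illustration of the energy shift of a massless particle, the sketch below evaluates both the boosted energy $\gamma_v (E - v p)$ with $E = pc$ and the square-root Doppler factor. It assumes the frame velocity is parallel to the momentum; the function names are chosen here for illustration.

```python
import math


def boosted_photon_energy(E, beta):
    """Energy of a massless particle in a frame moving with velocity beta*c
    along the particle's direction of motion, using gamma * (1 - beta) * E."""
    gamma_v = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma_v * (1.0 - beta) * E


def doppler_factor(beta):
    """Longitudinal Doppler factor sqrt((1 - beta) / (1 + beta))."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))


for beta in (-0.6, 0.0, 0.6):
    # Both columns should agree; negative beta corresponds to an approaching frame.
    print(beta, boosted_photon_energy(1.0, beta), doppler_factor(beta))
```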




13 The concept of mass


We already used Newton’s second law 


$$\vec F = \frac{d\vec p}{dt}, \tag{13.1}$$
which relates the applied force to the change of momentum per unit time. This form of 
Newton’s law, which we assumed valid in any inertial frame, thus takes the same form as 
in nonrelativistic classical mechanics. However, its consequences are rather different, as we 
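should now use the relativistic definition of the momentum vector. To examine this, just
substitute (11.16) into (13.1),


$$\vec F = \frac{d}{dt}\left(\frac{m\vec u}{\sqrt{1 - u^2/c^2}}\right) = m\gamma_u\,\vec a + m\gamma_u^3\,\frac{\vec u\cdot\vec a}{c^2}\,\vec u, \tag{13.2}$$
where $\vec a$ is the acceleration
$$\vec a = \frac{d\vec u}{dt} = \vec a_\parallel + \vec a_\perp. \tag{13.3}$$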

Contrary to the classical situation, the acceleration and the force are no longer parallel.
Decomposing (13.2) in vectors parallel and perpendicular to the velocity $\vec u$, (13.2) takes
the form


$$\vec F_\perp = m\gamma_u\,\vec a_\perp, \qquad \vec F_\parallel = m\gamma_u^3\,\vec a_\parallel. \tag{13.4}$$


Usually the ratio between the force and the acceleration is called the inertial mass. Using
the same terminology we thus conclude that the inertial mass of a particle is not uniquely
defined. It is, however, possible to introduce a longitudinal and a transverse inertial mass
based on (13.2), which are equal to


$$m_{\rm transverse} = m\gamma_u, \qquad m_{\rm longitudinal} = m\gamma_u^3. \tag{13.5}$$


Note that both inertial masses tend to infinity when the velocity approaches the velocity 
of light. This is the reason that we can never accelerate particles to or beyond the velocity 
of light. 
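
The approach to the velocity of light under a constant force can be made explicit with a few lines of code. The sketch below assumes a particle that starts at rest, so that $p = Ft$, and combines $E = c\sqrt{p^2 + m^2c^2}$ with $u = pc^2/E$; the function names, default units and sample values are assumptions made for illustration.

```python
import math


def velocity_under_constant_force(F, m, t, c=1.0):
    """Speed at time t of a particle that starts at rest and feels a constant force F.

    Uses p = F * t, E = c * sqrt(p^2 + m^2 c^2) and u = p c^2 / E.
    """
    p = F * t
    E = c * math.sqrt(p ** 2 + (m * c) ** 2)
    return p * c ** 2 / E


def inertial_masses(m, u, c=1.0):
    """Return (transverse, longitudinal) inertial mass: (m*gamma, m*gamma^3)."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return m * g, m * g ** 3


# The speed keeps increasing with time but stays below c = 1.
for t in (0.1, 1.0, 10.0, 1000.0):
    u = velocity_under_constant_force(F=1.0, m=1.0, t=t)
    print(t, u, inertial_masses(1.0, u))
```

For small t the result reduces to the Newtonian value F t / m, while the momentum F t itself grows without bound.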

At this point we have thus encountered three types of masses already. The mass denoted 
by m, which is sometimes called rest mass, is the mass that we are used to in classical 
physics. It does not depend on the velocity of the particle, but is a material constant which 
depends only on the type and the amount of matter of which the particle is composed. 
Then we have two types of inertial masses as defined in (13.5), which do depend on the
velocity of the particle. To make matters even more confusing, we sometimes use the so-called
invariant mass for systems consisting of several particles moving at some relative
velocities. The notion of invariant mass will be explained later in the second part of these 
lectures. 

Then finally, one uses the term gravitational mass for the masses that enter Newton's
law for the gravitational force between two bodies,


$$\vec F_{\rm grav} = -G_N\,m_1 m_2\,\frac{\vec r}{r^3}, \tag{13.6}$$


where $G_N$ is Newton's constant, equal to $6.7\times 10^{-11}\ {\rm m^3\,kg^{-1}\,sec^{-2}}$, and r is the distance
between the two bodies. However, this expression is incorrect in the relativistic context,
and there is no such thing as gravitational mass. The correct expression follows from
the theory of general relativity. According to this theory, a body of small mass travelling
at relativistic velocity with energy E and velocity $\vec v = c\vec\beta$ in the gravitational field of a
(spherically symmetric) object with very large mass M, feels a force


$$\vec F_{\rm grav} = -G_N\,M\,\frac{E}{c^2}\,\frac{(1 + \beta^2)\,\vec r - (\vec\beta\cdot\vec r)\,\vec\beta}{r^3}, \tag{13.7}$$
in the rest frame of the heavy object. For small $\beta$ this result reduces to (13.6). When
$\beta \approx 1$, the force is not directed along $\vec r$. The classical notion of gravitational mass is
thus undefined, although one may introduce a longitudinal and a transverse gravitational
mass, just as we did above for the inertial mass. Note that (13.7) is applicable for massless
particles, such as photons, in the gravitational field of a heavy object, such as the sun.
The fact that massless particles feel a gravitational force shows clearly that the ordinary
mass and the gravitational mass are intrinsically unrelated quantities¹⁴.
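
A short sketch may help to see what the force law for a relativistic test body implies. It evaluates $-G_N M (E/c^2)\,[(1+\beta^2)\vec r - (\vec\beta\cdot\vec r)\vec\beta]/r^3$ for given vectors; the function name and argument layout are assumptions, and the default value of G is the rounded one quoted in the text.

```python
def gravitational_force(E, beta, r, M, G=6.7e-11, c=299_792_458.0):
    """Force on a light test body with energy E and velocity c*beta at position r,
    measured from a heavy body of mass M at rest at the origin.

    beta and r are 3-tuples; the result is a 3-tuple of force components.
    """
    r_squared = sum(x * x for x in r)
    r_cubed = r_squared ** 1.5
    beta_squared = sum(b * b for b in beta)
    beta_dot_r = sum(b * x for b, x in zip(beta, r))
    prefactor = -G * M * E / c ** 2 / r_cubed
    return tuple(prefactor * ((1.0 + beta_squared) * x - beta_dot_r * b)
                 for x, b in zip(r, beta))


# In units with G = M = c = 1 and E = 1:
at_rest = gravitational_force(1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 1.0, G=1.0, c=1.0)
photon = gravitational_force(1.0, (0.0, 1.0, 0.0), (1.0, 0.0, 0.0), 1.0, G=1.0, c=1.0)
print(at_rest)   # Newtonian value with m = E / c^2
print(photon)    # twice as large for a photon moving perpendicular to r
```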

The question does remain, however, why the actual value for the inertial and the 
gravitational mass is the same for a body at rest. This question has motivated many 
experiments. The oldest one is by Galileo, who discovered that bodies fall with an acceleration
that is independent of their mass. Newton also considered the possibility that the inertial
and gravitational masses are not quite equal. Then, at the end of the nineteenth century,
Eötvös performed a series of famous experiments, comparing the centripetal acceleration
due to the earth’s rotation to the gravitational acceleration of the earth, which allowed 
him to show that the ratio of the inertial and the gravitational mass is universal: up to 
$10^{-9}$ he found the same ratio for all materials that he investigated. More modern methods
employ the gravitational field from the sun and the earth’s rotation around the sun and 
arrive at the same conclusion, but with an even higher degree of accuracy. 

The equality of the inertial and gravitational masses led Einstein to formulate the
equivalence principle, which formed the starting point for his construction of the theory of
general relativity¹⁵. Einstein noted that an observer in free fall does not notice a gravitational
field. Because of the equality of the inertial and gravitational masses, all objects around
him will stay at rest or will move with constant velocity. Observe that this is typical for
gravitational forces. For instance the orbits of charged particles subject to electromagnetic
forces always depend on the ratio of their electric charge and their inertial mass. So when
an observer is at rest relative to some charged particle, he does not observe the same
phenomenon, as other particles with a different charge-to-mass ratio will still be accelerated.


¹⁴ For a discussion of the various notions of mass and the unnecessary confusion caused by certain
textbook treatments, see L.B. Okun, Physics Today, June 1989 and May 1990 issues.
¹⁵ A. Einstein, Jahrb. Rad. Elektr. 4 (1907) 411; Ann. der Physik 35 (1911) 898, 38 (1912) 355.

According to the equivalence principle, it is always possible to choose, at least locally,
an inertial frame in which the theory of special relativity holds. In Einstein's words, the
observer has the right to consider his state as one at rest and his immediate environment
as field-free relative to gravitation.

The equivalence principle has far-reaching consequences. It implies that many of the
phenomena that can happen in a gravitational field can equally well be described in terms
of accelerated observers without a gravitational field present, a situation we discussed in
section 9. For instance, from the discussion at the end of section 9, we may conclude that
time runs slower for an observer “deeper in a gravitational potential”. In that context
the frequency shift (9.12) corresponds to the so-called gravitational red shift, provided the 
acceleration a is identified with the gravitational acceleration g. Unfortunately, a further 
discussion of the equivalence principle is beyond the scope of these lectures, which deal 
with the special theory of relativity. Nevertheless, it should be obvious from the above 
considerations that gravity has important implications for space and time! 


PART II


14 The four-dimensional world 


Events that take place at a certain point in space and at a certain time can formally be 
described as points in a four-dimensional vector space. This space is called the Minkowski 
space. Its coordinates are denoted by $x^\mu$ with $\mu = 1, 2, 3$ and 0. The first three coordinates
denote the usual space coordinates, say x, y and z, whereas $x^0$ is defined by


$$x^0 = ct, \tag{14.1}$$


with c the velocity of light. Vectors in Minkowski space are called four-vectors. 

Every particle follows a trajectory in Minkowski space. This trajectory describes the 
time evolution of the particle and is called the world line. For instance, the line $x^1 =
x^2 = x^3 = 0$ describes a particle at rest in the origin of the coordinate frame, whereas the
line defined by $x^1 = \beta x^0$, $x^2 = x^3 = 0$ describes a particle moving along the x-axis with
constant velocity $v = \beta c$, which at time t = 0 passes through the origin.

Light signals move along straight lines at an angle of 45 degrees with the $x^0$-axis. Light
signals passing through the origin have coordinates that satisfy the equation


$$x^2 + y^2 + z^2 - c^2t^2 = 0, \quad\text{or}\quad (x^1)^2 + (x^2)^2 + (x^3)^2 - (x^0)^2 = 0. \tag{14.2}$$


The surface defined by (14.2) is called the light cone. 

Obviously there is some arbitrariness in the choice of coordinates of the Minkowski 
space. One may, for instance, choose a coordinate frame that is related to the previous 
one by a rotation of the coordinate axes associated with $x^1$, $x^2$ and $x^3$. However, one
may also describe the events in a different inertial frame, so that the space and the time
coordinates are related to the previous ones by a Lorentz transformation. This Lorentz
transformation may be such that the origin of Minkowski space remains the same. This
means that the times measured in the two inertial frames, denoted by t and t', are both
equal to zero when the origins of the two spatial coordinate frames coincide. In general
Lorentz transformations transform straight lines into straight lines. This is so because the 
Lorentz transformations act linearly on the time and the spatial coordinates. This implies 
that particles moving at constant velocity in one inertial frame, move at constant velocity 
in any other frame. Furthermore points on the light cone remain on the light cone after a 
Lorentz transformation, because the condition (14.2) is Lorentz invariant. 

As Lorentz transformations act linearly on the coordinates of the Minkowski space, it 
is convenient to introduce a matrix notation. First we rewrite the Lorentz transformation 
in the following form 


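$$\vec x\,' = \vec x + (\gamma - 1)\,\frac{\vec x\cdot\vec v}{v^2}\,\vec v - \gamma\,\frac{\vec v}{c}\,x^0, \qquad
(x^0)' = \gamma\left(x^0 - \frac{\vec x\cdot\vec v}{c}\right). \tag{14.3}$$
This expression may be written as
$$(x^\mu)' = \sum_\nu L^\mu{}_\nu\,x^\nu, \tag{14.4}$$


where $L^\mu{}_\nu$ is a four-by-four matrix. As we shall discuss below, we will use a notation where
one distinguishes between upper and lower indices. Irrespective of whether the index is up
or down, quantities with two indices can be written as matrices, by associating the first
index to a "row" and the second one to a "column" index. In the case at hand, the matrix
associated with $L^\mu{}_\nu$ is obtained by comparing (14.4) with (14.3). This reveals that $L^\mu{}_\nu$
can be decomposed as


$$L^\mu{}_\nu = \begin{pmatrix} \delta_{ij} + \dfrac{\gamma - 1}{\beta^2}\,\beta_i\beta_j & -\gamma\beta_i \\[2mm] -\gamma\beta_j & \gamma \end{pmatrix}, \tag{14.5}$$
where i and j denote the first three index values of $\mu$ and $\nu$, while
$$\beta_i = \frac{v_i}{c}, \qquad \beta^2 = \sum_{i=1,2,3}\beta_i^2, \qquad \gamma = \frac{1}{\sqrt{1 - \beta^2}}. \tag{14.6}$$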

The matrix (14.5) takes a complicated form. However, in this discussion we are not primar- 
ily interested in its explicit dependence on the velocity. Rather we want to systematically 
study the properties of the matrix L. 

In the subsequent discussion a crucial role is played by the inner product 


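$$x\cdot y = x^1y^1 + x^2y^2 + x^3y^3 - x^0y^0 = \sum_{\mu,\nu}\eta_{\mu\nu}\,x^\mu y^\nu, \tag{14.7}$$


where $\eta_{\mu\nu}$ is equal to


$$\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{14.8}$$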


One often uses the matrix $\eta_{\mu\nu}$ to define four-vectors with lower indices by¹⁶
$$x_\mu = \eta_{\mu\nu}\,x^\nu = (\vec x, -x^0), \tag{14.9}$$


so that the inner product can be written as $x\cdot y = x_\mu y^\mu$. Observe that the inner product
is not of definite sign; $x\cdot y$ can be either positive, negative or zero.

It is not difficult to show that the inner product is invariant under Lorentz transformations.
We first observe that


$$\eta_{\mu\rho}\,L^\rho{}_\sigma\,\eta^{\sigma\nu} = \begin{pmatrix} \delta_{ij} + \dfrac{\gamma - 1}{\beta^2}\,\beta_i\beta_j & \gamma\beta_i \\[2mm] \gamma\beta_j & \gamma \end{pmatrix} = (L^{-1})^\nu{}_\mu, \tag{14.10}$$


where $\eta^{\mu\nu}$ is defined as the inverse of (14.8), and is thus equal to the same matrix.
Multiplication of (14.10) with $\eta_{\nu\lambda}\,L^\mu{}_\tau$ gives rise to the following equation (after a renaming of
indices)

$$L^\mu{}_\rho\,\eta_{\mu\nu}\,L^\nu{}_\sigma = \eta_{\rho\sigma}. \tag{14.11}$$


In matrix notation, (14.11) reads $L^T \eta\, L = \eta$, where $L^T$ is the transpose of the matrix L.
This result follows directly from identifying every first index with a row, and every second
index with a column index.


¹⁶ From now on we use the summation convention, according to which one sums over repeated indices.
We also note that the definition of (14.8) is not unique. One may find another definition in the literature
which differs by a minus sign.
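
The defining property $L^T \eta\, L = \eta$ is easy to test numerically for a boost in an arbitrary direction. The following sketch builds the boost matrix from $\beta_i = v_i/c$ with the index order (1, 2, 3, 0) used in the text; the helper names and the sample velocity are assumptions, and the construction requires a nonzero velocity.

```python
import math

# Metric with index order (1, 2, 3, 0): three plus signs, then a minus sign.
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def boost_matrix(beta):
    """Boost matrix for a velocity c*beta (3-tuple, nonzero), index order (1, 2, 3, 0)."""
    beta_squared = sum(b * b for b in beta)
    g = 1.0 / math.sqrt(1.0 - beta_squared)
    L = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            L[i][j] = delta + (g - 1.0) * beta[i] * beta[j] / beta_squared
        L[i][3] = -g * beta[i]      # space-time entries
        L[3][i] = -g * beta[i]      # time-space entries
    L[3][3] = g
    return L


def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


L = boost_matrix((0.3, -0.2, 0.5))               # illustrative velocity
check = matmul(transpose(L), matmul(ETA, L))     # should reproduce ETA
for row in check:
    print([round(entry, 12) for entry in row])
```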

From (14.11) we immediately conclude that the inner product is invariant under Lorentz
transformations,


$$x'\cdot y' = \eta_{\mu\nu}\,L^\mu{}_\rho\,L^\nu{}_\sigma\,x^\rho y^\sigma = \eta_{\rho\sigma}\,x^\rho y^\sigma = x\cdot y. \tag{14.12}$$


Similarly, one proves that four-vectors with lower indices transform under Lorentz
transformations according to
$$(x_\mu)' = x_\nu\,(L^{-1})^\nu{}_\mu. \tag{14.13}$$


Combining (14.9) and (14.13) shows once more that the inner product $x\cdot y = x_\mu y^\mu = x^\mu y_\mu$
is invariant.

We should stress here that the difference between vectors with upper and with lower 
indices is purely technical. Both vectors correspond to the same point in Minkowski space. 
Under a Lorentz transformation this point is changed into some other point, to which 
we can again associate a vector with upper and with lower indices. The change of this 
point in terms of a vector with upper indices is given by (14.4), while in the version 
with lower indices, it is described according to (14.13). Irrespective of the notation one 
adopts, the effect on the point in Minkowski space is the same. The identification of a 
point in Minkowski space with a vector with upper or lower indices follows from physics 
arguments, which prescribe how the Lorentz transformation acts on $\vec x$ and t (cf. (14.3)).
Our conventions then dictate that $x^0$ is equal to ct and $x_0$ is equal to $-ct$.

Previously we already discovered that $x^2 = x\cdot x = \vec x\cdot\vec x - c^2t^2$ is invariant under
Lorentz transformations. This quantity may be regarded as the norm of a four-vector, but
obviously special care is required in view of the fact that $x^2$ is not necessarily positive for
nonzero four-vectors. In (14.12) we have shown that not only $x\cdot x$ is invariant, but also
the inner product x - y of any two four-vectors x and y. 

One may wonder whether there exist matrices other than those defined in (14.5) which 
leave the inner product invariant. This is indeed the case. Such matrices should satisfy 
the condition (14.11), from which one easily proves that $(\det L)^2 = 1$, or


$$\det L = \pm 1. \tag{14.14}$$


One can show that the matrices (14.5) have determinant equal to +1. The corresponding
transformations are called special Lorentz transformations. Another set of matrices with
determinant equal to +1 corresponds to the rotations of the spatial coordinates, and takes
the form

$$L^\mu{}_\nu = \begin{pmatrix} R^i{}_j & 0 \\ 0 & 1 \end{pmatrix}. \tag{14.15}$$


Here the matrix $R^i{}_j$ represents a three-dimensional rotation. It turns out that the rotations
and the special Lorentz transformations and their products constitute all the matrices with
determinant equal to +1. All matrices with determinant equal to $-1$ can be written as a
product of rotations and special Lorentz transformations with either one of the matrices


$$P = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad
T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{14.16}$$


The transformation P is called parity reversal and corresponds to a reflection of the spatial
part of a four-vector. The transformation T is called time reversal. Products of the rotations,
the special Lorentz transformations and the parity reversal transformation constitute
the so-called orthochronous Lorentz transformations, which are characterized by the fact
that they leave the direction of time unchanged.

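Let us now consider a special Lorentz transformation with its velocity directed along
the $x^1$-axis. It corresponds to the matrix


$$L = \begin{pmatrix} \gamma & 0 & 0 & -\gamma\beta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\gamma\beta & 0 & 0 & \gamma \end{pmatrix}, \tag{14.17}$$


with $-1 < \beta < 1$ and $\gamma^{-2} = 1 - \beta^2$. It is sometimes convenient to use a different variable,
namely $\beta = \tanh\chi$, so that


$$L = \begin{pmatrix} \cosh\chi & 0 & 0 & -\sinh\chi \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sinh\chi & 0 & 0 & \cosh\chi \end{pmatrix}. \tag{14.18}$$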


This result shows that the structure of the Lorentz transformation is somewhat reminiscent
of an ordinary rotation in four dimensions. Indeed, (14.18) takes the form of a rotation
upon replacing $\chi$ by $i\chi$. The correspondence with a rotation also follows from (14.10),
which can be written in matrix notation as


$$L^{-1} = \eta^{-1}\,L^T\,\eta, \tag{14.19}$$


where $L^T$ denotes the transpose of L. If the matrix $\eta$ were the identity matrix, then L
would be an orthogonal matrix.


The representation (14.18) is convenient when considering several Lorentz transformations
in the same direction. For instance, the product of a Lorentz transformation with
parameter $\chi_1$ and a Lorentz transformation with parameter $\chi_2$ yields a similar
transformation with parameter

$$\chi_3 = \chi_1 + \chi_2. \tag{14.20}$$


From this result we can verify the rule for the addition of parallel velocities. Namely we
must have

$$\tanh\chi_3 = \tanh(\chi_1 + \chi_2) = \frac{\tanh\chi_1 + \tanh\chi_2}{1 + \tanh\chi_1\tanh\chi_2}, \tag{14.21}$$
or
$$\beta_3 = \frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2}. \tag{14.22}$$

This formula agrees with the result found previously. 
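
The additivity of the parameter $\chi$ gives a compact way to combine parallel velocities in practice. A minimal sketch, with function names chosen here for illustration:

```python
import math


def add_parallel_velocities(beta1, beta2):
    """Combine two parallel velocities (in units of c) by adding their
    parameters chi = atanh(beta) and converting back with tanh."""
    chi3 = math.atanh(beta1) + math.atanh(beta2)
    return math.tanh(chi3)


def add_parallel_velocities_direct(beta1, beta2):
    """The same result from the addition formula for parallel velocities."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


for pair in ((0.5, 0.5), (0.9, 0.9), (0.3, -0.7)):
    print(pair, add_parallel_velocities(*pair), add_parallel_velocities_direct(*pair))
```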


15 Physics in Minkowski space 


Consider an observer at rest at the origin of the spatial coordinate frame. His world line
coincides with the $x^0$ axis in Minkowski space. Other observers moving with constant velocity
with respect to him follow trajectories in Minkowski space that are straight lines. Two
separate events correspond to two separate points in Minkowski space, with coordinates
$x^\mu$ and $y^\mu$. It is meaningful to distinguish the following situations:


• The two points are separated by a “timelike distance”, meaning that $(x - y)^2 < 0$.
• The two points are separated by a “spacelike distance”, meaning that $(x - y)^2 > 0$.
• The two points are separated by a “lightlike distance”, meaning that $(x - y)^2 = 0$.


Because $(x - y)^2$ is invariant under Lorentz transformations, this classification is the same
in any inertial frame. Let us now discuss the characteristic features of each of these three
cases.

If $(x - y)^2 = 0$ then the two points are located on a possible world line of a photon. It
is not possible by a Lorentz transformation to go to a frame where both points are either
at the same place ($\vec x = \vec y$) or at the same time ($x^0 = y^0$).

If $(x - y)^2 > 0$ then it is possible to go to a Lorentz frame where the two points have
equal time ($x^0 = y^0$). It is not possible to decide what the temporal sequence is of events
that take place at $x^\mu$ and $y^\mu$. By a Lorentz transformation we can adjust the time difference
$x^0 - y^0$ to any value, positive or negative. On the other hand, it is not possible to bring
the two points to the same place ($\vec x = \vec y$) by means of a Lorentz transformation. Hence
the two points are always spatially separated.

If $(x - y)^2 < 0$ then it is always possible to go to a frame in which the two points are
located at the same position in space ($\vec x = \vec y$). However, the two points will always be
separate in time, and when we restrict ourselves to orthochronous Lorentz transformations,
one event will always take place at an earlier time than the other event.
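
The three cases can be told apart by a single invariant number. The sketch below stores events as tuples in the index order (1, 2, 3, 0) with $x^0 = ct$; the function names are hypothetical, and the exact comparison with zero is only reliable for exactly representable inputs such as the integers used here.

```python
def interval_squared(x, y):
    """(x - y)^2 with three plus signs and one minus sign for the time component.

    Events are tuples (x1, x2, x3, x0) with x0 = c * t.
    """
    d = [a - b for a, b in zip(x, y)]
    return d[0] ** 2 + d[1] ** 2 + d[2] ** 2 - d[3] ** 2


def classify_separation(x, y):
    """Return 'timelike', 'spacelike' or 'lightlike' for two events."""
    s2 = interval_squared(x, y)
    if s2 < 0:
        return "timelike"
    if s2 > 0:
        return "spacelike"
    return "lightlike"


origin = (0, 0, 0, 0)
print(classify_separation(origin, (1, 0, 0, 2)))   # timelike
print(classify_separation(origin, (3, 0, 0, 1)))   # spacelike
print(classify_separation(origin, (1, 0, 0, 1)))   # lightlike
```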




For an observer located at the origin of Minkowski space, there are three different and
disconnected regions:


• The Past: this is the region where time is negative and $x^2 < 0$. Signals traveling
at a speed smaller than or equal to the speed of light can reach the observer from
anywhere in this region.


• The Future: this is the region where time is positive and $x^2 < 0$. Signals emitted by
the observer at a speed less than or equal to the speed of light can reach every point
in this region.


• Elsewhere: this is the region where time can be positive or negative and $x^2 > 0$.
No signals that travel at a speed less than or equal to the speed of light can reach
the observer from this region, and vice versa. Since we believe that no information
can be transmitted at a speed larger than the speed of light, this region is causally
not connected to the observer. Note, however, that when time evolves the observer
moves through Minkowski space, such that signals from “elsewhere” may still be able
to reach him, but at some later time.
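
The classification above can be sketched in a few lines of Python. This is only an illustration: the function names `classify_separation` and `region_of` are hypothetical, events are assumed to be given as tuples `(t, x1, x2, x3)`, and the sign convention $(x - y)^2 = |\Delta\vec x|^2 - c^2 \Delta t^2$ of the text is used. Points on the light cone itself, which the three regions above do not include, are reported separately.

```python
ORIGIN = (0.0, 0.0, 0.0, 0.0)

def classify_separation(x, y, c=1.0):
    """Classify two events (t, x1, x2, x3) by the sign of (x - y)^2.

    Convention of the text: (x - y)^2 = |dx|^2 - (c dt)^2, so a timelike
    separation has a negative square.
    """
    dt = x[0] - y[0]
    spatial_sq = sum((a - b) ** 2 for a, b in zip(x[1:], y[1:]))
    s2 = spatial_sq - (c * dt) ** 2
    if s2 < 0:
        return "timelike"
    if s2 > 0:
        return "spacelike"
    return "lightlike"

def region_of(event, c=1.0):
    """Region of Minkowski space, seen from the origin, that contains `event`."""
    kind = classify_separation(event, ORIGIN, c)
    if kind == "spacelike":
        return "elsewhere"
    if event[0] == 0:
        return "origin"  # the only non-spacelike point at t = 0
    side = "future" if event[0] > 0 else "past"
    return side if kind == "timelike" else side + " light cone"
```

For instance, with $c = 1$ the event `(2, 1, 0, 0)` lies in the future of the observer, whereas `(1, 2, 0, 0)` lies elsewhere.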


So far we have only considered trajectories in Minkowski space that are straight lines.
Such trajectories correspond to particles which move at some constant velocity relative
to an observer at rest whose trajectory is parallel to the $x^0$-axis. However, one may also
consider more complicated trajectories, corresponding to particles that do not travel at a
constant velocity. Because their instantaneous velocity must always be smaller than the
speed of light, these trajectories will always evolve such that the angle with the $x^0$-axis is
less than 45°. Let us consider such a trajectory parametrized in terms of some arbitrary
parameter $\xi$. Hence we have four functions $x^\mu(\xi)$, specifying the value of the Minkowski
coordinates for every value of $\xi$. Between $\xi$ and $\xi + d\xi$, the particle is displaced by

$$ d\vec x = \frac{\partial \vec x}{\partial \xi}\, d\xi , \tag{15.1} $$

while the elapsed time is equal to

$$ dt = \frac{1}{c}\, \frac{\partial x^0}{\partial \xi}\, d\xi . \tag{15.2} $$

Therefore the instantaneous velocity equals

$$ \vec v = c \left( \frac{\partial x^0}{\partial \xi} \right)^{-1} \frac{\partial \vec x}{\partial \xi} . \tag{15.3} $$


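On the other hand, the proper time, i.e. the time elapsed in the instantaneous rest frame
of the particle, is equal to

$$ d\tau = \sqrt{1 - \vec v^{\,2}/c^2}\; dt = \frac{1}{c} \sqrt{ - \eta_{\mu\nu}\, \frac{\partial x^\mu}{\partial \xi} \frac{\partial x^\nu}{\partial \xi} }\; d\xi . \tag{15.4} $$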




Figure 11: Trajectories in Minkowski space. The trajectory (a) corresponds to an observer 
at rest, while the trajectory (b) corresponds to an observer who is initially at rest and is 
then accelerated to some constant velocity. 


Therefore the proper time elapsed for a trajectory $x^\mu(\xi)$ between $x_1^\mu = x^\mu(\xi_1)$ and
$x_2^\mu = x^\mu(\xi_2)$ is given by

$$ \tau_2 - \tau_1 = \frac{1}{c} \int_{\xi_1}^{\xi_2} d\xi\, \sqrt{ - \eta_{\mu\nu}\, \frac{\partial x^\mu}{\partial \xi} \frac{\partial x^\nu}{\partial \xi} } . \tag{15.5} $$

Observe that this quantity is manifestly invariant under Lorentz transformations. It is also
reparametrization independent. By this we mean that if we substitute for the parameter
$\xi$ some function of another parameter $\zeta$, this leads to the same expression (15.5) but now
in terms of $\zeta$.
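
The reparametrization independence of (15.5) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original notes: the function name `proper_time` is hypothetical, the world line is passed as a Python function returning `(t, x, y, z)`, and the integral is approximated by summing the invariant intervals of short chords. As a test case it uses the hyperbolic world line $x = \sqrt{1 + t^2}$ (units with $c = 1$), for which the proper time between $t = 0$ and $t = T$ equals $\mathrm{arsinh}\, T$.

```python
import math

def proper_time(worldline, xi_start, xi_end, steps=20000, c=1.0):
    """Approximate (15.5) by summing the invariant intervals of short chords.

    `worldline(xi)` returns (t, x, y, z) for the parameter value xi.
    """
    h = (xi_end - xi_start) / steps
    total = 0.0
    prev = worldline(xi_start)
    for k in range(1, steps + 1):
        cur = worldline(xi_start + k * h)
        dt = cur[0] - prev[0]
        dx2 = sum((a - b) ** 2 for a, b in zip(cur[1:], prev[1:]))
        # -eta_{mu nu} dx^mu dx^nu = (c dt)^2 - |dx|^2, nonnegative below light speed
        total += math.sqrt(max((c * dt) ** 2 - dx2, 0.0)) / c
        prev = cur
    return total

def by_time(t):
    """Hyperbolic world line parametrized by the coordinate time t."""
    return (t, math.sqrt(1.0 + t * t), 0.0, 0.0)

def by_rapidity(z):
    """The same world line parametrized by z, with t = sinh(z)."""
    return (math.sinh(z), math.cosh(z), 0.0, 0.0)

T = 2.0
tau_from_t = proper_time(by_time, 0.0, T)
tau_from_z = proper_time(by_rapidity, 0.0, math.asinh(T))
```

Both parametrizations describe the same curve, so `tau_from_t` and `tau_from_z` should agree with each other and with `math.asinh(T)` up to the discretization error.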


16 Invariant mass 


We already observed previously that the momentum $\vec p$ of a particle and its energy $E$ divided
by $c$ transform in the same fashion under a Lorentz transformation as the position $\vec x$ and
the time $t$ times $c$. Therefore it is convenient to define the four-vector of energy and
momentum by

$$ p^\mu = (\vec p, p^0) , \tag{16.1} $$

with

$$ p^0 = \frac{E}{c} . \tag{16.2} $$

Under Lorentz transformations, $p^\mu$ transforms as

$$ (p^\mu)' = L^\mu{}_\nu\, p^\nu . \tag{16.3} $$



Inner products that involve the vector $p^\mu$ are again invariant under Lorentz transformations.
In particular,

$$ p^2 = \eta_{\mu\nu}\, p^\mu p^\nu = \vec p^{\,2} - \frac{E^2}{c^2} \tag{16.4} $$

is invariant and determined by the rest mass of the particle according to

$$ p^2 = - m^2 c^2 . \tag{16.5} $$

Here we made use of

$$ E = c\, p^0 = c \sqrt{\vec p^{\,2} + m^2 c^2} . \tag{16.6} $$

For a single particle the four-momentum is thus confined to one sheet of a hyperboloid
defined by

$$ p^2 = - m^2 c^2 , \qquad p^0 > 0 . \tag{16.7} $$

As long as the mass is nonzero, we can always, by means of a Lorentz transformation,
choose a frame where $\vec p = 0$. This is the rest frame. However, for massless particles this is
problematic, since this would require a singular transformation with $\beta = \pm 1$.
For two particles of mass $m_1$ and $m_2$, the total momentum $p^\mu_{\rm tot} = p_1^\mu + p_2^\mu$ is restricted
to (see figure 12)

$$ p_{\rm tot}^2 \le - (m_1 + m_2)^2 c^2 , \qquad p^0_{\rm tot} > 0 . \tag{16.8} $$


Although the value of $p_{\rm tot}^2$ thus covers a continuous range of values for configurations of
several particles, its actual value is invariant under Lorentz transformations. For this
reason

$$ m_{\rm inv} = \sqrt{ \frac{- p_{\rm tot}^2}{c^2} } \tag{16.9} $$

is often called the invariant mass. Obviously, for a single particle there is no distinction
between the invariant mass and the rest mass. Of course, the invariant mass can be
defined for any system. For a system consisting of a number of free particles one can define
a center-of-mass frame, where the total three-momentum of these particles is equal to zero,

$$ \vec p_{\rm tot} = \sum_i \vec p_i = 0 . \tag{16.10} $$

Here the index $i$ labels the various particles. The invariant mass is then given by

$$ m_{\rm inv} = \frac{1}{c} \sum_i p_i^0 = \frac{1}{c^2} \sum_i E_i^{\rm cm} , \tag{16.11} $$

where $E_i^{\rm cm}$ denotes the energy of each particle as measured in the center-of-mass frame.
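
As a small numerical illustration of (16.4), (16.6) and (16.9), the following sketch computes the invariant mass of a set of free particles. The function name `invariant_mass` and the input format, a list of `(mass, (px, py, pz))` pairs, are assumptions made for this example only; units with $c = 1$ are the default.

```python
import math

def invariant_mass(particles, c=1.0):
    """Invariant mass (16.9) of free particles given as (mass, (px, py, pz))."""
    e_tot = 0.0
    p_tot = [0.0, 0.0, 0.0]
    for m, p in particles:
        p2 = sum(pi * pi for pi in p)
        e_tot += c * math.sqrt(p2 + (m * c) ** 2)      # energy from (16.6)
        p_tot = [a + b for a, b in zip(p_tot, p)]
    # p_tot^2 with the metric of the text, cf. (16.4)
    p_tot_sq = sum(a * a for a in p_tot) - (e_tot / c) ** 2
    return math.sqrt(max(-p_tot_sq, 0.0)) / c

# Two massless particles moving back to back, and moving in parallel
back_to_back = invariant_mass([(0.0, (0.0, 0.0, 1.0)), (0.0, (0.0, 0.0, -1.0))])
parallel = invariant_mass([(0.0, (0.0, 0.0, 1.0)), (0.0, (0.0, 0.0, 1.0))])
```

The two photon-like configurations show that the invariant mass of a system is not the sum of the rest masses: it is nonzero for the back-to-back pair and vanishes for the parallel pair.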




Figure 12: Diagram indicating the possible values of the momenta for particles with mass
$m_1$ and $m_2$, and the possible values for the total momentum of the two particles. The
latter is restricted to the shaded region.


17 Transformation of forces


In any reference frame the force is defined according to Newton’s law as the change of the
momentum per unit time. Hence we have

$$ \vec F = \frac{d\vec p}{dt} . \tag{17.1} $$

This definition allows us to determine the transformation character of a force under Lorentz
transformations, by evaluating how the right-hand side of (17.1) changes under a Lorentz
transformation.
Consider a particle with momentum $\vec p$ travelling with velocity $\vec u$. Under a Lorentz
transformation $\vec p$ transforms according to

$$ p'_\parallel = \gamma \left( p_\parallel - \frac{v E}{c^2} \right) , \qquad \vec p\,'_\perp = \vec p_\perp , \tag{17.2} $$


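where $p_\parallel$ and $\vec p_\perp$ denote the momentum parallel and perpendicular to the velocity vector $\vec v$
associated with the Lorentz transformation. Now we differentiate (17.2) with respect to $t$,

$$ \frac{dp'_\parallel}{dt} = \gamma \left( \frac{dp_\parallel}{dt} - \frac{v}{c^2}\, \vec u \cdot \frac{d\vec p}{dt} \right) , \qquad \frac{d\vec p\,'_\perp}{dt} = \frac{d\vec p_\perp}{dt} , \tag{17.3} $$

where we used the equation (see footnote 17)

$$ \frac{dE}{dt} = \vec u \cdot \frac{d\vec p}{dt} . \tag{17.4} $$

On the other hand, we may use the Lorentz transformation (8.4) and evaluate

$$ \frac{dp'_\parallel}{dt} = \left( \frac{dt'}{dt} \right) \frac{dp'_\parallel}{dt'} = \gamma \left( 1 - \frac{\vec u \cdot \vec v}{c^2} \right) \frac{dp'_\parallel}{dt'} , \tag{17.5} $$

and likewise

$$ \frac{d\vec p\,'_\perp}{dt} = \gamma \left( 1 - \frac{\vec u \cdot \vec v}{c^2} \right) \frac{d\vec p\,'_\perp}{dt'} . \tag{17.6} $$

Combining the above results, we thus find that

$$ \frac{dp'_\parallel}{dt'} = \frac{ \dfrac{dp_\parallel}{dt} - \dfrac{v}{c^2}\, \vec u \cdot \dfrac{d\vec p}{dt} }{ 1 - \dfrac{\vec u \cdot \vec v}{c^2} } , \qquad \frac{d\vec p\,'_\perp}{dt'} = \frac{ \dfrac{d\vec p_\perp}{dt} }{ \gamma \left( 1 - \dfrac{\vec u \cdot \vec v}{c^2} \right) } . \tag{17.7} $$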

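Hence we conclude that a particle travelling with a velocity $\vec u$ and experiencing a force
$\vec F$, will experience a force $\vec F'$ in another inertial frame, which is equal to

$$ F'_\parallel = F_\parallel - \frac{v}{c^2}\, \frac{ \vec u_\perp \cdot \vec F_\perp }{ 1 - \dfrac{\vec u \cdot \vec v}{c^2} } , \tag{17.8} $$

$$ \vec F'_\perp = \frac{ \vec F_\perp \sqrt{1 - v^2/c^2} }{ 1 - \dfrac{\vec u \cdot \vec v}{c^2} } , \tag{17.9} $$

which shows that Lorentz transformations along the direction of the force leave the force
invariant. Note also that (17.8-9) simplify considerably when the original frame is the rest
frame (i.e. $\vec u = 0$). In that case we obtain

$$ F'_\parallel = F_\parallel , \qquad \vec F'_\perp = \vec F_\perp \sqrt{1 - v^2/c^2} . \tag{17.10} $$

Footnote 17: Note that equation (17.4) is equivalent to

$$ p_\mu \frac{dp^\mu}{dt} = 0 , $$

which follows directly from (16.5).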


However, under general Lorentz transformations the forces change in a complicated
fashion. This should be compared to the situation in Newtonian mechanics, where the
forces remain invariant under Galilei transformations. Observe also that the transformation
rule for a force vector is more complicated than that for a four-vector. Indeed, the force
vector does not constitute a four-vector, and neither does the velocity vector. It is possible
to define generalizations of the force and the velocity vector that can be extended to regular
four-vectors, but their introduction does not always lead to further insight. For instance,
a “velocity” four-vector is sometimes used which is equal to $p^\mu/m$ for massive particles.
Furthermore a force four-vector can be defined by $K^\mu = dp^\mu/d\tau$. Modulo a gamma factor
the first three components of these four-vectors then define the actual velocity and force
three-vectors.
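
The transformation rule (17.7) is easily evaluated numerically. The sketch below is an illustration only: the function name `transform_force` is hypothetical, three-vectors are plain Python sequences, and units with $c = 1$ are the default. It splits the force into components parallel and perpendicular to $\vec v$ and applies (17.7) to each.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transform_force(F, u, v, c=1.0):
    """Force seen in a frame moving with velocity v, for a particle of velocity u.

    Implements (17.7) with F = dp/dt; all arguments are three-vectors.
    """
    v2 = dot(v, v)
    if v2 == 0.0:
        return list(F)
    speed = math.sqrt(v2)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    n = [vi / speed for vi in v]                      # unit vector along v
    F_par = dot(F, n)
    F_perp = [Fi - F_par * ni for Fi, ni in zip(F, n)]
    denom = 1.0 - dot(u, v) / c**2
    Fp_par = (F_par - speed * dot(u, F) / c**2) / denom
    Fp_perp = [Fi / (gamma * denom) for Fi in F_perp]
    return [Fp_par * ni + Fi for ni, Fi in zip(n, Fp_perp)]

# Particle at rest (u = 0): only the perpendicular component is rescaled, cf. (17.10)
rest_case = transform_force([1.0, 2.0, 0.0], [0.0, 0.0, 0.0], [0.6, 0.0, 0.0])
# Force along the boost direction: unchanged, whatever the particle velocity
aligned_case = transform_force([1.0, 0.0, 0.0], [0.3, 0.4, 0.0], [0.6, 0.0, 0.0])
```

The two sample cases reproduce the statements made in the text: a force acting on a particle at rest keeps its parallel component and has its perpendicular component reduced by $\sqrt{1 - v^2/c^2}$, and a force directed along $\vec v$ is invariant.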


18 A charged particle in a uniform constant electric 
field 


The equation of motion for a particle in a uniform and constant electric field $\vec E$ reads

$$ \frac{d\vec p}{dt} = q \vec E . \tag{18.1} $$

Obviously the momentum perpendicular to the electric field remains constant. The solution
of (18.1) equals

$$ \vec p(t) = \vec p_\perp + q \vec E\, (t - t_0) , \tag{18.2} $$

where $t_0$ is some arbitrary reference time at which the momentum parallel to $\vec E$ vanishes,
and $\vec p_\perp$ denotes the components of the momentum perpendicular to $\vec E$. In terms of the
velocity, (18.2) reads

$$ \frac{ m \vec v(t) }{ \sqrt{1 - v^2(t)/c^2} } = \vec p_\perp + q \vec E\, (t - t_0) . \tag{18.3} $$

Squaring this equation yields for the magnitude of the velocity,

$$ |\vec v(t)| = c\, \sqrt{ \frac{ p_\perp^2 + q^2 (t - t_0)^2 E^2 }{ m^2 c^2 + p_\perp^2 + q^2 (t - t_0)^2 E^2 } } . \tag{18.4} $$

For $t$ large, the velocity approaches the velocity of light $c$, in accord with the theory of
relativity.




Let us simplify the situation and examine the case where the particle is at rest at $t = 0$.
As $\vec p$, $\vec v$ and $\vec E$ are now all pointing into the same direction, we suppress the vector notation.
The solution for $v(t)$ follows from equation (18.4),

$$ v(t) = \frac{ q E t }{ \sqrt{ m^2 c^2 + q^2 E^2 t^2 } }\; c . \tag{18.5} $$

For small $t$ this gives

$$ v(t) = \frac{qE}{m}\, t + O(t^3) , \tag{18.6} $$

in accordance with the classical result. For large $t$ the velocity approaches the velocity of
light,

$$ v(t) = c + O(t^{-2}) . \tag{18.7} $$

We can also find the displacement of the particle (in the direction of the electric field) as
a function of $t$ by integrating (18.5),

$$ x(t) = x(0) + \frac{m c^2}{qE} \left( \sqrt{ 1 + \left( \frac{qEt}{mc} \right)^2 } - 1 \right) . \tag{18.6} $$

For small $t$ the displacement changes quadratically with the time,

$$ x(t) = x(0) + \frac{qE}{2m}\, t^2 + O(t^4) . \tag{18.7} $$

For large $t$, we have

$$ x(t) = x(0) - \frac{m c^2}{qE} + c\,t + O(t^{-1}) . \tag{18.8} $$

If we choose the origin of the coordinate frame such that

$$ x(0) = \frac{m c^2}{qE} , \tag{18.9} $$

then

$$ x^2(t) - c^2 t^2 = \left( \frac{m c^2}{qE} \right)^2 . \tag{18.10} $$

With this choice of coordinates the particle follows a trajectory in Minkowski space whose
points are all related by a Lorentz transformation.
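
The formulas for the velocity and the displacement can be checked numerically. The following sketch is an illustration under the assumptions that $qE > 0$, that the particle is at rest at $t = 0$, and that the origin is chosen as in (18.9); the function names `velocity` and `position` are hypothetical, and units with $c = 1$ are the default.

```python
import math

def velocity(t, q, E, m, c=1.0):
    """Velocity (18.5) of a particle that is at rest at t = 0."""
    return c * q * E * t / math.sqrt((m * c) ** 2 + (q * E * t) ** 2)

def position(t, q, E, m, c=1.0):
    """Displacement with the origin chosen as in (18.9), x(0) = m c^2 / (q E)."""
    x0 = m * c**2 / (q * E)
    return x0 * math.sqrt(1.0 + (q * E * t / (m * c)) ** 2)

def hyperbola_invariant(t, q, E, m, c=1.0):
    """Left-hand side of (18.10); it should not depend on t."""
    return position(t, q, E, m, c) ** 2 - (c * t) ** 2
```

Evaluating `hyperbola_invariant` at several times gives the same value $(mc^2/qE)^2$, while `velocity` grows linearly for small $t$ and stays below $c$ for large $t$.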




Figure 13: The velocity (a) and the displacement (b) of a particle initially at rest, in a 
uniform constant electric field as a function of t. Fig. (c) shows the trajectory followed by 
the particle in Minkowski space, with the coordinate frame specified in (18.9). 


19 A charged particle in a uniform constant magnetic 
field 


The equation of motion reads

$$ \frac{d\vec p}{dt} = q\, \vec v \times \vec B , \tag{19.1} $$

where we used the formula for the Lorentz force. It implies that the momentum parallel
to the magnetic field must be constant, as this force is perpendicular to the magnetic field.
Because the force is also perpendicular to the momentum, $\vec p^{\,2}$ must be constant as well,

$$ \frac{d\vec p^{\,2}}{dt} = 2\, \vec p \cdot \frac{d\vec p}{dt} = 0 . \tag{19.2} $$

Combining these results it follows that not only the magnitude of the total velocity is
constant, and thus the gamma factor, but also the velocity component parallel to the
magnetic field as well as the magnitude of the velocity vector perpendicular to the magnetic
field. Only the direction of the perpendicular velocity changes in time and the particle will
describe a circular motion in its projection on a plane perpendicular to the magnetic field.
To derive this result let us choose a coordinate frame in which $\vec B$ is directed along the
positive $z$-axis, and write down the differential equation (19.1) for the velocity components
in the $x$-$y$ plane,

$$ \frac{dv_x}{dt} = \frac{qB}{m\gamma}\, v_y , \qquad \frac{dv_y}{dt} = - \frac{qB}{m\gamma}\, v_x . \tag{19.3} $$


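It is convenient to adopt a 2 × 2 matrix notation and write (19.3) as

$$ \frac{d\vec v_\perp(t)}{dt} = {\rm sgn}(q) \begin{pmatrix} 0 & \omega \\ -\omega & 0 \end{pmatrix} \vec v_\perp(t) , \tag{19.4} $$

where $\omega$ is the so-called cyclotron frequency, defined as

$$ \omega = \left| \frac{qB}{m\gamma} \right| , \tag{19.5} $$

which depends on the energy of the particle. The solution of (19.3) is given by

$$ \vec v_\perp(t) = \begin{pmatrix} \cos\omega t & {\rm sgn}(q) \sin\omega t \\ -{\rm sgn}(q) \sin\omega t & \cos\omega t \end{pmatrix} \vec v_\perp(0) . \tag{19.6} $$

The transverse position of the particle follows from (19.6),

$$ \vec r_\perp(t) = \vec r_\perp^{\;0} + \frac{1}{\omega} \begin{pmatrix} \sin\omega t & -{\rm sgn}(q) \cos\omega t \\ {\rm sgn}(q) \cos\omega t & \sin\omega t \end{pmatrix} \vec v_\perp(0) , \tag{19.7} $$

so that the particle follows a helical motion with transverse velocity $\vec v_\perp(0)$. The radius of
this motion is equal to

$$ R = \frac{ |\vec v_\perp| }{ \omega } . \tag{19.8} $$

The product of the radius and the magnetic field determines directly the transverse
momentum of the particle, a phenomenon that is exploited in detectors to measure the momentum
of a charged elementary particle,

$$ B R = \frac{ m \gamma\, |\vec v_\perp| }{ |q| } = \frac{ |\vec p_\perp| }{ |q| } . \tag{19.9} $$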
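
The relations (19.5), (19.6), (19.8) and (19.9) can be collected in a short numerical sketch. The function names `cyclotron` and `transverse_velocity` are hypothetical, $\vec B$ is assumed to point along the positive $z$-axis as in the text, and units with $c = 1$ are the default.

```python
import math

def cyclotron(q, B, m, v_perp, v_par=0.0, c=1.0):
    """Return (omega, R, p_perp) for a charge q in a uniform field B along z."""
    v2 = v_perp**2 + v_par**2
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    omega = abs(q * B) / (m * gamma)      # cyclotron frequency (19.5)
    radius = v_perp / omega               # radius of the circle (19.8)
    p_perp = m * gamma * v_perp           # transverse momentum
    return omega, radius, p_perp

def transverse_velocity(t, v0, omega, sign_q):
    """Rotate the initial transverse velocity v0 = (vx, vy) as in (19.6)."""
    cs = math.cos(omega * t)
    sn = sign_q * math.sin(omega * t)
    return (cs * v0[0] + sn * v0[1], -sn * v0[0] + cs * v0[1])

# Example: negative charge with transverse speed 0.6 c in a field B = 2
omega, radius, p_perp = cyclotron(q=-1.0, B=2.0, m=1.0, v_perp=0.6)
```

In this example the product of `radius` and the field strength equals `p_perp` divided by $|q|$, which is the relation (19.9) used in detectors, and the speed obtained from `transverse_velocity` stays constant in time.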


20 Charge and current densities 


In the theory of electromagnetism one deals with electric and magnetic fields and with
charge and current densities. In most cases all these quantities depend on space and time.
The charge and current density $\rho(\vec x, t)$ and $\vec J(\vec x, t)$ define the distribution of electric charge.
The charge density gives the charge per unit volume, while the current density gives the
amount of charge crossing a unit area per unit time. According to experiments, electric
charge is conserved. Thus the total electric charge contained in some volume can only
change by the amount of charge moving across the boundary of that volume. This implies
that the charge and current density are subject to the continuity equation,

$$ \vec\nabla \cdot \vec J + \frac{\partial \rho}{\partial t} = 0 . \tag{20.1} $$




Figure 14: A charged particle moving in a uniform constant magnetic field. 


In this section we discuss how these densities transform under Lorentz transformations. 
Obviously, according to the relativity postulate, we must require that electric charge is 
conserved in any inertial frame. This implies that the condition (20.1) must be Lorentz 
invariant. 

In order to determine what this implies for the transformation rule for the charge
and current densities, we must first determine how derivatives transform under Lorentz
transformations. Consider some function $f(x)$ of the space-time coordinates $x^\mu$. In another
inertial frame we have a function $f'$, which is defined by the requirement that its value at the
transformed coordinates $(x^\mu)'$, which follow from a corresponding Lorentz transformation,
is equal to the value of the function $f$ taken at the old coordinates. Hence we have

$$ f'(x') = f(x) . \tag{20.2} $$


Here we have implicitly assumed that the function f transforms as a scalar. For instance, if 
we had several functions f specifying the value of the components of some vector quantity, 
say the components of the electric field, then the definition (20.2) should be modified, 
because the components are always specified with respect to some reference system, which 
changes when we change the coordinate frame. These remarks will be useful shortly, but 
first we just consider a single function f. 

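It is now rather straightforward to determine how derivatives of a function transform
under Lorentz transformations. First we define

$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left( \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}, \frac{\partial}{\partial x^0} \right) = \left( \vec\nabla, \frac{1}{c} \frac{\partial}{\partial t} \right) . \tag{20.3} $$

Now we differentiate (20.2),

$$ \frac{\partial f(x)}{\partial x^\mu} = \frac{\partial x'^\nu}{\partial x^\mu}\, \frac{\partial f'(x')}{\partial x'^\nu} . \tag{20.4} $$

This result still holds for any redefinition of $x$ into $x'$. However, if $x$ and $x'$ are related by
a Lorentz transformation such as defined in (14.4), we write (20.4) as

$$ \frac{\partial}{\partial x^\mu} f(x) = L^\nu{}_\mu\, \frac{\partial}{\partial x'^\nu} f'(x') , \tag{20.5} $$

or,

$$ \partial'_\mu f'(x') = (L^{-1})^\nu{}_\mu\, \partial_\nu f(x) . \tag{20.6} $$

Comparing this result to (20.2) shows that the derivative (20.3) transforms as a four-vector
with lower indices (cf. 14.13).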

Now it is easy to see how the charge and current densities should transform in order that
the continuity equation (20.1) be invariant. Namely we combine the charge and current
densities into a four-component vector $J^\mu$ with $J^0 = c\rho$, viz.,

$$ J^\mu(x) = \left( \vec J(x), c\rho(x) \right) , \tag{20.7} $$

which is assumed to transform under Lorentz transformations as a four-vector, i.e.,

$$ J'^\mu(x') = L^\mu{}_\nu\, J^\nu(x) . \tag{20.8} $$

Combining (20.6) and (20.8) then shows that (20.1), which may be written as

$$ \partial_\mu J^\mu = 0 , \tag{20.9} $$

is invariant under Lorentz transformations, so that electric charge is locally conserved in
any inertial frame.

Transformation rules such as (20.8) can be generalized to objects that carry several
four-vector indices. Such objects are called tensors. We will encounter an example of such
a tensor shortly.

We may verify the above result by considering a more practical situation, where charge
is carried by one type of particle. If at some space-time point the velocity of these particles
is $\vec u$, then the density $n$ of these particles as measured in the laboratory is related to the
density $n_0$ measured in the instantaneous local rest frame by

$$ n = \frac{ n_0 }{ \sqrt{1 - u^2/c^2} } . \tag{20.10} $$

This relation is a direct consequence of the Lorentz contraction of a small volume element.
As the volume becomes smaller, the particle density has to increase by the same factor in
order that the total number of particles in the volume be the same. We may now express
the charge and current density in the laboratory frame as

$$ \rho = q\, n , \qquad \vec J = q\, n\, \vec u , \tag{20.11} $$

where $q$ is the charge of the particles. This result can be summarized as

$$ J^\mu = \frac{q\, n_0}{m}\, p^\mu . \tag{20.12} $$

Because $m$, $q$ and $n_0$ are invariant under Lorentz transformations, the current $J^\mu$ transforms
as a regular four-vector, in accord with the conclusion reached above.

Because the charge-current distributions transform as a four-vector, they can be
characterized (at least locally, at a given space-time point) in a Lorentz-invariant way, just as
the space-time and momentum vectors. For instance, we may have a time-like vector $J^\mu$,
which satisfies (at a given space-time point)

$$ J^2 = \vec J^{\,2} - c^2 \rho^2 < 0 . \tag{20.13} $$

By means of a Lorentz transformation, we can always bring it into a form where $\vec J = 0$
locally. Hence a more appropriate name for this situation is that the current is
“charge-like”. The most obvious example of this situation is when in some inertial frame, we
have charges at rest, so that $\vec J = 0$ and $\rho$ is not zero (we assume that the charges do
not neutralize each other). Another situation is the “current-like” current, which in some
inertial frame corresponds to $\rho = 0$ and $\vec J \neq 0$. This happens when we have moving carriers
of positive and negative charges which neutralize each other, but still give rise to a nonzero
current. Observe that an electrically neutral conductor with a nonzero electric current is
no longer electrically neutral in some other inertial frame!
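
The last observation can be made concrete with a few lines of code. The sketch below applies the four-vector transformation (20.8) to the components $(c\rho, J_\parallel)$ for a boost with velocity $v$ along the current; the function name `boost_current` is hypothetical, the components of $\vec J$ perpendicular to the boost are left out because they do not change, and units with $c = 1$ are the default.

```python
import math

def boost_current(rho, J_par, v, c=1.0):
    """Transform the charge density and the parallel current density, cf. (20.8).

    Returns (rho', J_par') in a frame moving with velocity v along the current.
    """
    beta = v / c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    c_rho_new = gamma * (c * rho - beta * J_par)
    J_par_new = gamma * (J_par - beta * c * rho)
    return c_rho_new / c, J_par_new

# A neutral conductor carrying a current, seen from a frame moving along the wire
rho_moving, J_moving = boost_current(rho=0.0, J_par=1.0, v=0.6)
```

In the moving frame `rho_moving` is nonzero although the conductor is neutral in the original frame, while the combination $\vec J^{\,2} - c^2\rho^2$ of (20.13) keeps its value.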


21 Invariance of Maxwell’s equations 


Before Einstein formulated the special theory of relativity it was known that Maxwell’s 
equations are invariant under Lorentz transformations, albeit that the implications of this 
fact were only properly assessed much later. We will now prove that the Maxwell equations 
are indeed consistent with the theory of special relativity. As a result we will obtain the 
appropriate Lorentz transformations for the electromagnetic fields. 

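The inhomogeneous Maxwell equations in vacuum are as follows

$$ \vec\nabla \times \vec B - \frac{1}{c^2} \frac{\partial \vec E}{\partial t} = \mu_0 \vec J , \tag{21.1} $$

$$ \vec\nabla \cdot \vec E = \frac{\rho}{\varepsilon_0} , \tag{21.2} $$

where $\varepsilon_0$ and $\mu_0$ are the permittivity and the permeability in vacuum, respectively. Note
that these two quantities are related to the speed of light,

$$ c = \frac{1}{\sqrt{\varepsilon_0 \mu_0}} . \tag{21.3} $$

In addition there are the two homogeneous equations

$$ \vec\nabla \times \vec E + \frac{\partial \vec B}{\partial t} = 0 , \tag{21.4} $$

$$ \vec\nabla \cdot \vec B = 0 . \tag{21.5} $$

The latter two equations can be solved by expressing $\vec E$ and $\vec B$ in terms of a scalar and a
vector potential, $V$ and $\vec A$, respectively

$$ \vec E = - \vec\nabla V - \frac{\partial \vec A}{\partial t} , \qquad \vec B = \vec\nabla \times \vec A . \tag{21.6} $$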


It is possible to change the potentials by means of so-called gauge transformations, without
affecting the electromagnetic fields $\vec E$ and $\vec B$. These gauge transformations are

$$ V \to V' = V - \frac{\partial \Lambda}{\partial t} , \qquad \vec A \to \vec A' = \vec A + \vec\nabla \Lambda . \tag{21.7} $$

Let us now substitute (21.6) into (21.1) and (21.2). Straightforward calculation leads
to the following equations

$$ \left( \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right) \vec A - \vec\nabla \left( \vec\nabla \cdot \vec A + \frac{1}{c^2} \frac{\partial V}{\partial t} \right) = - \mu_0 \vec J , $$

$$ \left( \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right) \frac{V}{c} + \frac{1}{c} \frac{\partial}{\partial t} \left( \vec\nabla \cdot \vec A + \frac{1}{c^2} \frac{\partial V}{\partial t} \right) = - c\, \mu_0\, \rho . \tag{21.8} $$

As we know that $\vec J$ and $c\rho$ should constitute a four-vector, we conclude that the above
equations should transform among themselves under Lorentz transformations. In order
that the left-hand side of the equations transform in the same way as the right-hand side,
we must also combine $\vec A$ and $V$ into a four-vector. Indeed, writing

$$ A_\mu = (\vec A, - c^{-1} V) , \qquad \mbox{or} \qquad A^\mu = (\vec A, c^{-1} V) , \tag{21.9} $$

we can easily combine the two equations (21.8) into a single equation,

$$ \partial^\mu \partial_\mu A_\nu - \partial_\nu ( \partial^\mu A_\mu ) = - \mu_0 J_\nu . \tag{21.10} $$

This equation is manifestly covariant under Lorentz transformations. By this we mean
that the four equations contained in (21.10) transform among themselves under Lorentz
transformations as the components of a four-vector. The four-vector $A_\mu$ is also called the
vector potential.

It is convenient to define the following antisymmetric tensor,

$$ F_{\mu\nu} = c\, ( \partial_\mu A_\nu - \partial_\nu A_\mu ) , \tag{21.11} $$

in terms of which (21.10) can be written as

$$ \partial^\mu F_{\mu\nu} = - c\, \mu_0\, J_\nu . \tag{21.12} $$

The tensor $F_{\mu\nu}$ is invariant under gauge transformations (21.7), which, in four-vector
notation, read

$$ A_\mu \to A'_\mu = A_\mu + \partial_\mu \Lambda . \tag{21.13} $$




The fact that $F_{\mu\nu}$ contains precisely six independent gauge invariant components, which
are expressed in terms of first-order derivatives of the vector potential, indicates that this
tensor is closely related to the magnetic and electric fields, $\vec B$ and $\vec E$. Indeed, the three
components

$$ F_{ij} = c\, ( \partial_i A_j - \partial_j A_i ) \tag{21.14} $$

coincide with the components of the magnetic field. More precisely

$$ F_{12} = - F_{21} = c B_3 , \qquad F_{13} = - F_{31} = - c B_2 , \qquad F_{23} = - F_{32} = c B_1 . \tag{21.15} $$

The remaining components,

$$ F_{i0} = c\, ( \partial_i A_0 - \partial_0 A_i ) = - \frac{\partial V}{\partial x^i} - \frac{\partial A_i}{\partial t} , \tag{21.16} $$

are just equal to the electric field, i.e.,

$$ F_{i0} = - F_{0i} = E_i . \tag{21.17} $$

Often one writes $F_{\mu\nu}$ as a 4 × 4 matrix, which thus takes the form

$$ F_{\mu\nu} = \begin{pmatrix} 0 & cB_3 & -cB_2 & E_1 \\ -cB_3 & 0 & cB_1 & E_2 \\ cB_2 & -cB_1 & 0 & E_3 \\ -E_1 & -E_2 & -E_3 & 0 \end{pmatrix} . \tag{21.18} $$

Also the homogeneous Maxwell equations can be written in manifestly covariant form.
They read

$$ \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0 . \tag{21.19} $$

Observe that (21.19) is fully antisymmetric in three indices. Therefore (21.19) comprises
four equations. Choosing $[\mu\nu\rho] = [ijk]$ leads to (21.5), while $[\mu\nu\rho] = [0ij]$ gives rise to
(21.4).

From the fact that space-time derivatives and the vector potential transform as
four-vectors under Lorentz transformations, we derive the transformation rules for $F_{\mu\nu}$. We
find,

$$ F'_{\mu\nu}(x') = (L^{-1})^\rho{}_\mu\, (L^{-1})^\sigma{}_\nu\, F_{\rho\sigma}(x) . \tag{21.20} $$

In order to evaluate this expression it is convenient to write it in matrix form,

$$ F' = (L^{-1})^{\rm T}\, F\, L^{-1} . \tag{21.21} $$




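Let us evaluate this expression for a Lorentz transformation with the corresponding
velocity parameter $\vec v$ directed along the $x^3$-axis. The matrix $L^\mu{}_\nu$ for this transformation is

$$ L^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma & -\beta\gamma \\ 0 & 0 & -\beta\gamma & \gamma \end{pmatrix} . \tag{21.22} $$

Substituting (21.18) and (21.22) into (21.21) leads to the following result for $F'_{\mu\nu}$,

$$ F'_{\mu\nu} = \begin{pmatrix} 0 & cB_3 & -\gamma(cB_2 - \beta E_1) & \gamma(E_1 - \beta cB_2) \\ -cB_3 & 0 & \gamma(cB_1 + \beta E_2) & \gamma(E_2 + \beta cB_1) \\ \gamma(cB_2 - \beta E_1) & -\gamma(cB_1 + \beta E_2) & 0 & E_3 \\ -\gamma(E_1 - \beta cB_2) & -\gamma(E_2 + \beta cB_1) & -E_3 & 0 \end{pmatrix} . \tag{21.23} $$

It thus follows that the magnetic and electric fields have changed according to

$$ \begin{array}{ll} cB'_1 = \gamma(cB_1 + \beta E_2) , & E'_1 = \gamma(E_1 - \beta cB_2) , \\ cB'_2 = \gamma(cB_2 - \beta E_1) , & E'_2 = \gamma(E_2 + \beta cB_1) , \\ cB'_3 = cB_3 , & E'_3 = E_3 . \end{array} \tag{21.24} $$

We can cast this result back in vector notation and obtain the more general result for a
Lorentz transformation of the magnetic and electric fields,

$$ \begin{array}{ll} B'_\parallel(x') = B_\parallel(x) , & E'_\parallel(x') = E_\parallel(x) , \\ \vec B'_\perp(x') = \gamma \left( \vec B_\perp(x) - \frac{1}{c^2}\, \vec v \times \vec E(x) \right) , & \vec E'_\perp(x') = \gamma \left( \vec E_\perp(x) + \vec v \times \vec B(x) \right) . \end{array} \tag{21.25} $$

Just as for four-vectors there are Lorentz-invariant bilinears of the tensor $F_{\mu\nu}$. The
following two expressions bilinear in the magnetic and electric field components are indeed
invariant,

$$ \vec E^{\,2} - c^2 \vec B^{\,2} \qquad \mbox{and} \qquad \vec E \cdot \vec B , \tag{21.26} $$

as can be verified by explicit computation. To do this, we first note the vector identities

$$ (\vec v \times \vec E) \cdot (\vec v \times \vec B) = v^2\, ( \vec E_\perp \cdot \vec B_\perp ) , \qquad \vec E \cdot (\vec v \times \vec B) = - \vec B \cdot (\vec v \times \vec E) , \tag{21.27} $$

which hold for any three vectors $\vec v$, $\vec E$ and $\vec B$. Substituting the transformation rules (21.25)
into the quantities (21.26) and making use of the two identities (21.27) then shows that
these quantities remain unchanged under Lorentz transformations. This result can also be
derived in four-component notation, by virtue of the identities

$$ F_{\mu\nu} F^{\mu\nu} = -2\, ( \vec E^{\,2} - c^2 \vec B^{\,2} ) , \qquad \varepsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma} = - 8 c\, \vec E \cdot \vec B , \tag{21.28} $$

where $\varepsilon^{\mu\nu\rho\sigma}$ is the fully antisymmetric Levi-Civita symbol, which is equal to 1 if $[\mu\nu\rho\sigma]$ is
an even permutation of [0123], to −1 if $[\mu\nu\rho\sigma]$ is an odd permutation of [0123], and to 0
otherwise.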
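
The explicit computation can also be delegated to a short program. The sketch below builds the matrix (21.18), applies (21.20) for a boost along the $x^3$-axis, and reads off the transformed fields, so that (21.24) and the invariance of the quantities (21.26) can be checked for arbitrary numbers. The function names are hypothetical, the index order is (1, 2, 3, 0) as in (21.18), and the magnetic field is passed as $c\vec B$.

```python
import math

def field_tensor(E, cB):
    """Matrix F_{mu nu} of (21.18); index order (1, 2, 3, 0)."""
    E1, E2, E3 = E
    B1, B2, B3 = cB
    return [[0.0, B3, -B2, E1],
            [-B3, 0.0, B1, E2],
            [B2, -B1, 0.0, E3],
            [-E1, -E2, -E3, 0.0]]

def inverse_boost_z(beta):
    """Inverse of the matrix (21.22), obtained by replacing beta with -beta."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, g, beta * g],
            [0.0, 0.0, beta * g, g]]

def transform_tensor(F, Linv):
    """F'_{mu nu} = (L^-1)^rho_mu (L^-1)^sigma_nu F_{rho sigma}, cf. (21.20)."""
    return [[sum(Linv[r][m] * F[r][s] * Linv[s][n]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]

def fields_from_tensor(F):
    """Read off (E, cB) from the matrix, using (21.15) and (21.17)."""
    E = (F[0][3], F[1][3], F[2][3])
    cB = (F[1][2], -F[0][2], F[0][1])
    return E, cB

def invariants(E, cB):
    """The two quantities of (21.26), with the second one multiplied by c."""
    e2 = sum(x * x for x in E)
    b2 = sum(x * x for x in cB)
    return e2 - b2, sum(x * y for x, y in zip(E, cB))

E, cB = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
E_new, cB_new = fields_from_tensor(
    transform_tensor(field_tensor(E, cB), inverse_boost_z(0.6)))
```

Comparing `invariants(E, cB)` with `invariants(E_new, cB_new)` gives equal values, and the components of `E_new` and `cB_new` agree with (21.24) for $\beta = 0.6$.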


22 Electromagnetic forces acting on a point charge 


Now that we have determined how the electromagnetic fields transform under Lorentz
transformations, we should verify that the forces derived from these fields transform in
accordance with the result established earlier. In principle, this is a straightforward exercise.
Consider, for instance, a point charge $q$ moving with velocity $\vec u$ in an electromagnetic field.
The charge feels a force equal to

$$ \vec F = q\, ( \vec E + \vec u \times \vec B ) . \tag{22.1} $$

With the help of (21.18) one readily verifies that this can be written as (in components)

$$ F_i = q \left( F_{i0} + c^{-1} F_{ij}\, u^j \right) , \tag{22.2} $$

where $F_{i0}$ and $F_{ij}$ are the components of the tensor $F_{\mu\nu}$ with indices 0 and $i, j = 1, 2, 3$
(to be distinguished from the components of the force vector $\vec F$ on the left-hand side of
(22.2)). This result takes the following form when expressed in terms of the momentum,
rather than the velocity of the particle,

$$ F_i = \frac{q}{m c\, \gamma_u}\, F_{i\mu}\, p^\mu . \tag{22.3} $$

In this expression we note the presence of the first three components of a four-vector
$K_\mu$, defined by

$$ K_\mu = \frac{q}{m c}\, F_{\mu\nu}\, p^\nu , \tag{22.4} $$

which satisfies the condition

$$ p^\mu K_\mu = 0 , \tag{22.5} $$

by virtue of the antisymmetry of the tensor $F_{\mu\nu}$. Written in components, (22.5) reads

$$ p^0 K_0 + p^i K_i = 0 , \qquad \mbox{or} \qquad K_0 = - \frac{p^i}{p^0}\, K_i = - \frac{1}{c}\, \vec u \cdot \vec K . \tag{22.6} $$


Under Lorentz transformations $K_\mu$ transforms as a four-vector, as it is constructed from
the product of two quantities that each transform covariantly, by means of a summation
over an upper and a lower index. For our purpose, the relevant transformations are

$$ K'_\parallel = \gamma\, ( K_\parallel + \beta K_0 ) , \qquad \vec K'_\perp = \vec K_\perp , \tag{22.7} $$

where the plus sign in the first equation is related to the fact that we transform a four-vector
with lower indices. Substituting (22.6) into (22.7) gives

$$ K'_\parallel = \gamma \left( K_\parallel - \frac{v}{c^2}\, \vec u \cdot \vec K \right) , \qquad \vec K'_\perp = \vec K_\perp . \tag{22.8} $$

Now we have to take into account the factor $\gamma_u^{-1}$ in (22.3), which transforms according
to

$$ \gamma_{u'}^{-1} = \frac{ \gamma_u^{-1} }{ \gamma \left( 1 - \dfrac{\vec u \cdot \vec v}{c^2} \right) } . \tag{22.9} $$

Combining (22.8) and (22.9) then shows that $\vec F$ as defined in (22.1) transforms indeed in
accordance with (17.8).
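
This consistency can also be tested numerically for arbitrary field values. The sketch below is an illustration, not part of the original argument: it computes the Lorentz force (22.1) in the original frame, transforms it with (17.7), and compares the result with the Lorentz force evaluated directly from the transformed fields (21.25) and the transformed particle velocity. The function names are hypothetical, the boost is taken along the $z$-axis, the velocity transformation used is the standard relativistic one, and units with $c = 1$ are the default.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, B, u):
    """Force (22.1) on a charge q with velocity u."""
    uxB = cross(u, B)
    return tuple(q * (e + w) for e, w in zip(E, uxB))

def boost_fields_z(E, B, v, c=1.0):
    """Field transformation (21.25) for a boost with velocity v along z."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    vvec = (0.0, 0.0, v)
    vxE, vxB = cross(vvec, E), cross(vvec, B)
    E_new = (g * (E[0] + vxB[0]), g * (E[1] + vxB[1]), E[2])
    B_new = (g * (B[0] - vxE[0] / c**2), g * (B[1] - vxE[1] / c**2), B[2])
    return E_new, B_new

def boost_velocity_z(u, v, c=1.0):
    """Relativistic transformation of the particle velocity for a boost along z."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    d = 1.0 - u[2] * v / c**2
    return (u[0] / (g * d), u[1] / (g * d), (u[2] - v) / d)

def boost_force_z(F, u, v, c=1.0):
    """Force transformation (17.7) for a boost with velocity v along z."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    d = 1.0 - u[2] * v / c**2
    u_dot_F = sum(a * b for a, b in zip(u, F))
    return (F[0] / (g * d), F[1] / (g * d), (F[2] - v * u_dot_F / c**2) / d)

q, v = 2.0, 0.5
E, B, u = (1.0, 2.0, 3.0), (0.4, -0.5, 0.6), (0.1, 0.2, 0.3)
E_new, B_new = boost_fields_z(E, B, v)
force_direct = lorentz_force(q, E_new, B_new, boost_velocity_z(u, v))
force_by_rule = boost_force_z(lorentz_force(q, E, B, u), u, v)
```

For the sample values above `force_direct` and `force_by_rule` coincide up to rounding errors, as they should for any admissible choice of fields and velocities.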


23 Electromagnetic fields generated by a moving point 
charge 


As an application of the transformation rules for the electric and magnetic fields, we
consider a point charge $q$ moving at uniform velocity $\vec v$. We assume that the point charge
is located in the origin of its rest frame $S'$. In this frame the electromagnetic fields are
equal to

$$ \vec E'(\vec r\,', t') = \frac{q}{4\pi \varepsilon_0}\, \frac{ \vec r\,' }{ |\vec r\,'|^3 } , \qquad \vec B'(\vec r\,', t') = 0 , \tag{23.1} $$

where $\vec r\,'$ denotes the position vector in $S'$. At time $t' = 0$ the observer is closest to the
point charge. We assume that at $t' = 0$ the origin of $S'$ coincides with the origin of the rest
frame of the observer, denoted by $S$. Note that the observer is not located at the origin
of $S'$. In $S$ the space-time coordinates are $\vec r$ and $t$. They are related to the space-time
coordinates of $S'$ in the following manner,

$$ r'_\parallel = \gamma\, ( r_\parallel - v t ) , \qquad \vec r\,'_\perp = \vec r_\perp , \qquad t' = \gamma \left( t - \frac{\vec v \cdot \vec r}{c^2} \right) . \tag{23.2} $$




Figure 15: A point charge moving with velocity $\vec v$ as seen in the rest frame $S$ of an observer
and in the rest frame $S'$ of the charge. The position of the charge in $S$ at $t = 0$, and of the
observer in $S'$ at $t' = 0$ is indicated. Note that the smallest distance between the observer
and the charge is the same in both frames and given by $|\vec r_\perp| = |\vec r\,'_\perp|$.


From now on let $\vec r$ and $t$, and $\vec r\,'$ and $t'$ denote the space-time coordinates of the observer
in $S$ and $S'$, respectively. Therefore we have

$$ r_\parallel = 0 , \qquad \vec r\,'_\perp = \vec r_\perp , \qquad r'_\parallel = - \gamma v t = - v t' , \qquad t' = \gamma t . \tag{23.3} $$


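Now let us apply the transformation rules (21.25) to determine the fields in $S$. For the
components parallel to $\vec v$, one finds

$$ E'_\parallel(\vec r\,', t') = \frac{q}{4\pi \varepsilon_0}\, \frac{ r'_\parallel }{ ( r'^{\,2}_\perp + r'^{\,2}_\parallel )^{3/2} } = E_\parallel(\vec r, t) , \qquad B'_\parallel(\vec r\,', t') = 0 = B_\parallel(\vec r, t) . \tag{23.4} $$

This result must be expressed in terms of the coordinates $\vec r$ and $t$. Substitution of (23.3)
gives

$$ E_\parallel(\vec r, t) = - \frac{q}{4\pi \varepsilon_0}\, \frac{ \gamma v t }{ ( r^2 + \gamma^2 v^2 t^2 )^{3/2} } , \qquad B_\parallel(\vec r, t) = 0 . \tag{23.5} $$

The determination of the field in directions perpendicular to $\vec v$ requires a little more
work. Using again (21.25) we have

$$ \vec B'_\perp(\vec r\,', t') = 0 = \gamma \left( \vec B_\perp(\vec r, t) - \frac{1}{c^2}\, \vec v \times \vec E(\vec r, t) \right) , $$

$$ \vec E'_\perp(\vec r\,', t') = \frac{q}{4\pi \varepsilon_0}\, \frac{ \vec r\,'_\perp }{ ( r'^{\,2}_\perp + r'^{\,2}_\parallel )^{3/2} } = \gamma \left( \vec E_\perp(\vec r, t) + \vec v \times \vec B(\vec r, t) \right) . \tag{23.6} $$

From the first equation it follows that

$$ \vec B_\perp(\vec r, t) = \frac{1}{c^2}\, \vec v \times \vec E(\vec r, t) . \tag{23.7} $$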




Figure 16: The magnetic field lines and the transverse components of the electric field lines 
generated by a point charge moving at constant velocity. 


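This yields

$$ \vec v \times \vec B = - \frac{v^2}{c^2}\, \vec E_\perp . \tag{23.8} $$

Substituting this result into the second equation of (23.6) gives

$$ \vec E_\perp(\vec r, t) = \frac{q}{4\pi \varepsilon_0}\, \frac{ \gamma\, \vec r_\perp }{ ( r^2 + \gamma^2 v^2 t^2 )^{3/2} } , \tag{23.9} $$

where we have again used (23.3) to write the field in terms of $\vec r$ and $t$. Combining (23.7)
and (23.9) gives the following expression for the magnetic field,

$$ \vec B = \frac{q\, \mu_0}{4\pi}\, \frac{ \gamma\, \vec v \times \vec r }{ ( r^2 + \gamma^2 v^2 t^2 )^{3/2} } , \tag{23.10} $$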
where we have used (21.3). For small velocities, this result is in agreement with the Biot- 
Savart law. 

Finally, we should point out that the above results can also be obtained directly in the 
rest frame of the observer by solving Maxwell’s equations. However, this derivation is more 
complicated because of the effect of retardation. The fields observed at a time t are not 
related to the position of the charge at time t, but to its position at some earlier time, just 
because the electromagnetic fields propagate with the velocity of light. Therefore it takes 
a certain amount of time before the fields reach the observer. Retardation is the reason 
why the terms in the denominators of the fields E and B have such a conspicuous form. 
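
To get a feeling for the time dependence of these fields, the expressions (23.5), (23.9) and (23.10) can be evaluated numerically. The sketch below is an illustration only: the function name `fields_at_observer` is hypothetical, the constant $k$ stands for $1/(4\pi\varepsilon_0)$ and is set to 1 by default together with $c = 1$, and only the magnitudes of the transverse electric field and of the magnetic field are returned.

```python
import math

def fields_at_observer(q, v, r_perp, t, k=1.0, c=1.0):
    """Return (E_par, E_perp, B) seen by the observer at time t.

    E_par follows (23.5), E_perp is the magnitude of (23.9), and B is the
    magnitude of (23.10), written as v E_perp / c^2 by means of (23.7).
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = (r_perp**2 + (gamma * v * t) ** 2) ** 1.5
    E_par = -k * q * gamma * v * t / denom
    E_perp = k * q * gamma * r_perp / denom
    B = v * E_perp / c**2
    return E_par, E_perp, B

# Transverse field at closest approach (t = 0) for a slow and a fast charge
slow = fields_at_observer(q=1.0, v=0.01, r_perp=2.0, t=0.0)
fast = fields_at_observer(q=1.0, v=0.8, r_perp=2.0, t=0.0)
```

At $t = 0$ the transverse field equals the Coulomb value multiplied by $\gamma$, so `fast[1]` exceeds `slow[1]`, while the parallel component vanishes and changes sign as the charge passes the observer.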




Figure 17: The magnitude of the electric field components generated by a point charge
moving at uniform velocity as a function of time. The magnitude of the magnetic field $\vec B$
shows the same shape as $E_\perp$, as follows from (23.7). The solid lines refer to low velocity,
$v \ll c$, the dashed lines to high velocity, $v \approx c$.




